Preface

This  thesis  presents  a system  of  computer  programs.  They 
are designed for student use. However, their design is modular,
and the code is written in ANSI FORTRAN and common 8080 assembler. This
design  was  selected  to  make  the  system  transportable.  Over  4000 
source  lines  are  included.  If  a user  does  not  require  the  complete 
system,  individual  routines  may  easily  be  extracted. 

Some notes of appreciation are due. Charlie Dutra, Tom
Gabrielle, Gene Mechler, and Professor V. O. McBrien all participated
in educating me and in creating the opportunity for this
thesis. My typist, Ms. Nancy Myers, produced an amazing transformation
in the manuscript in almost no time at all. The members of
my thesis committee have graciously endured my moments of confusion
and given solid support. I am thankful for Professor Richard's
careful comments and Dr. Hartrum's understanding. Without
Dr. Kabrisky's perspicacious underwriting not even the statement of
my objectives in this bottom line would exist. I sincerely thank
all who have helped me.

A special  note  follows:  MJ,  Jack,  Amy,  Moira,  Nancy  - 
your  patience  with  me  has  been  magnificent.  You  have  my  promise 
that  'the  best  is  yet  to  be.'  Thank  you. 

John R. Leary

A tool for developing microprocessor based pattern recognizers
is presented. A two segment system of programs is implemented.
One segment is a subsystem consisting of a generalized
pattern classifier program and utility routines for an INTEL
SBC 80/20 microprocessor system. The other segment is a subsystem
of four interactive programs. These four programs support
feature selection, pattern class definition and performance
evaluation using procedures fitted to the classifier algorithm.
This subsystem operates on a user supplied file of feature vectors.
It produces a class defining structure for use by the classifier.
It can use a TEKTRONIX 4014 for graphics support and will operate
interactively within the CDC 6600 Intercom partition. Structured
design, modular code, buffer allocation algorithms, and ANSI
standard FORTRAN code make this segment transportable. The
classifier segment requires an 8080 system. Less than 256 bytes
of ROM are used. Data buffer locations and sizes, the number of
classes and the number of features are specified by the user.
Experiments produced estimates of classifier performance for this
system. An error rate of less than ten percent is reported for
one 26 class character recognition experiment.

A DEVELOPMENT SYSTEM FOR
MICROPROCESSOR  BASED  PATTERN  RECOGNIZERS 

I.  Introduction 

This thesis presents a development system for use as a
design tool in implementing experimental pattern recognizers.
Some characteristics of pattern recognizers are described in the
next chapter. The art of designing a pattern recognition system
is also discussed in that chapter. The development system produced
for this thesis is discussed in the three following chapters.
In chapter three, functional requirements are established.
Chapter four defines the algorithms upon which this system is
based. In chapter five, the design and use of the system are
documented. Two questions remain to be addressed. Their answers
justify the above discussion. First, of what value are pattern
recognition systems to the Air Force? And second, how does this
system relate to such Air Force pattern recognizers?

In  a recent  issue  of  Air  University  Review,  Dr.  Paul  Namin 
explores  the  military  need  for  Identification  Friend,  Foe,  Neutral 
(IFFN)  systems.  He  makes  the  point  that  without  such  systems 
there  is  a serious  limitation,  i.e.,  the  rule  of  visual  engagement, 
which  restricts  the  degree  to  which  the  potential  of  any  weapons 
system can be realized. An anecdote illustrates his point. It
tells of the destruction of a multimillion dollar weapons system
while  its  pilot  is  unaware  of  any  threat.  Namin  hypothesizes 
that  this  might  occur  because  of  marginal  enemy  advantage  in 
target  detection  capability.  He  then  suggests  that  solutions  to 
the  technology  problem  posed  by  IFFN  need  not  necessarily  seek 
new  sensor  phenomena.  Rather,  he  holds  that  a more  effective 
integration  of  sensor  data  may  be  produced  by  enhancements  to 
signal  processing  systems  and  "shrinkage  in  device  cost  and  size." 
This  is  the  synergistic  effect  of  "getting  more  performance  out 
of  a collection  of  data  than  any  one  of  them  can  provide."  (Ref  9) 
This may be the general military application for pattern recognition
systems. At each node of a complex network of sensors
may lie a pattern recognizer. It reduces volumes of higher level
data into simple classification statements which funnel through
the network as command and control status items. Namin's IFFN is
"a  technological  challenge  for  the  '80s."  Classification  inputs 
to  status  networks  begin  with  simple  pattern  recognizers 
applied  to  small  pieces  of  the  complex  electromagnetic  warfare 
environment. 

The  development  system  presented  in  this  thesis  is  a simple 
one.  It  is  primarily  pedagogical,  and  is  intended  for  AFIT 
student  use  in  exploration  of  experimental  solutions  to  specific 
recognition  problems.  But  the  concept  and  the  configuration  of 
this  system  are  also  aimed  at  the  practical  problem  of  cheaply 
implementing prototype pattern recognizer systems. Such
prototypes may provide sufficient empirical knowledge of key
sensor  data  environments  for  the  ultimate  implementation  of 
reliability  systems. 


II.  Concepts 

This  chapter  presents  the  theoretical  foundations  for  the 
thesis.  Following  a brief  statement  of  notation  conventions  and 
some  definitions,  design  of  pattern  recognizers  is  discussed  in 
general.  Concepts  relevant  to  classification  algorithms  are  next 
presented.  Then  the  selection  of  pattern  features  is  discussed. 
Finally  two  types  of  pattern  recognition  applications  are  described. 

Notation 

Several definitions and notation conventions make this
report easier to follow. Assume that any pattern environment may consist
of N patterns. These patterns may separate into I sets whose
members share some degree of commonality. Each of these sets of
patterns will be known as a pattern class. An arbitrary pattern
class may contain L members. Any individual pattern may be
represented by J characteristic features. If these features are
considered as an ordered J-tuple, an individual pattern can be
represented by a feature vector having J components. These
vectors will be referenced as 1 x J row matrices when this is convenient.
A population of feature vectors collected from the
pattern environment will be described as a data base or data set,
and denoted $\Omega$. This collection can be separated into disjoint
classes. Each of these will be denoted $\omega_i$. An arbitrary feature
vector in the population $\Omega$ will be denoted $F_n$. Similarly, an
arbitrary feature vector in an arbitrary class will be denoted
$F_\ell$. A consistent use of the single subscripts $n$ and $\ell$ will overcome
any possible ambiguity in specification of feature vector
class membership. These definitions are summarized in the
notation below.


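\[ \Omega = \{ F_n \mid 1 \le n \le N \} \tag{2-1} \]

where

\[ F_n = (f_1, \ldots, f_j, \ldots, f_J) \tag{2-2} \]

\[ \Omega = \bigcup_{i=1}^{I} \omega_i \tag{2-3} \]

where

\[ F_\ell \in \omega_i \tag{2-4} \]

and

\[ \omega_i \cap \omega_k = \emptyset \quad \text{for all } i, k \text{ when } i \ne k \tag{2-5} \]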

Symbol conventions are implicit in this notation. Vector components
and scalar values are represented by lower case letters.
Subscripts are used only when needed to clarify significant
differences and not used to establish a trail of relationships.
Thus $F_\ell$ is a member of $\omega_k$ and context will suffice to identify the
vector of which $f_j$ is a component. With the exception of the index
limits N, I, J, and L, only vectors and matrices are denoted by
capital letters. The transpose of the usual 1 x J row matrix $F$
to a J x 1 column matrix is denoted $F^T$. There is one exception
to the convention for denoting matrices. The symbol $\Sigma_i$ is used
to denote the within-class covariance matrix for class $\omega_i$. This
covariance is estimated by:

\[ \Sigma_i = \frac{1}{L} \sum_{\ell=1}^{L} \left( F_\ell^T F_\ell - P_i^T P_i \right) \tag{2-6} \]

where

\[ P_i = \frac{1}{L} \sum_{\ell=1}^{L} F_\ell \tag{2-7} \]

In equation (2-6), the notation $F_\ell^T F_\ell$ indicates a square J x J
matrix. Also, $F_\ell F_\ell^T$ denotes the scalar which is the square of
vector magnitude.
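
The estimates of equations (2-6) and (2-7) can be illustrated with a short sketch. It is written in Python for brevity and is not taken from the thesis software, which is FORTRAN and 8080 assembler. The function names and the two-member sample class are assumptions made only for this illustration.

```python
def class_mean(vectors):
    """Centroid P_i of a class of 1 x J row vectors, as in equation (2-7)."""
    count = len(vectors)
    dims = len(vectors[0])
    return [sum(v[j] for v in vectors) / count for j in range(dims)]


def outer(row):
    """F^T F for a 1 x J row vector F: a square J x J matrix."""
    return [[a * b for b in row] for a in row]


def squared_magnitude(row):
    """F F^T for a 1 x J row vector F: the scalar squared magnitude."""
    return sum(a * a for a in row)


def class_covariance(vectors):
    """Within-class covariance estimate, as in equation (2-6)."""
    count = len(vectors)
    dims = len(vectors[0])
    mean_outer = outer(class_mean(vectors))
    cov = [[0.0] * dims for _ in range(dims)]
    for v in vectors:
        v_outer = outer(v)
        for r in range(dims):
            for c in range(dims):
                cov[r][c] += (v_outer[r][c] - mean_outer[r][c]) / count
    return cov


# Hypothetical class with L = 2 members and J = 2 features.
sample_class = [[1.0, 2.0], [3.0, 6.0]]
print(class_mean(sample_class))
print(class_covariance(sample_class))
```

Note that the outer product yields the J x J matrix while the inner product yields the scalar, which is the distinction drawn in the paragraph above.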

The  definitions  above  make  possible  explanations  of  several 
concepts  upon  which  this  thesis  is  based. 

Pattern  Recognizer  Design 

To  recognize  a pattern  is  to  perceive  it  as  something 
previously  known.  With  this  simple  statement  Webster  suggests 
what  Kanal  (Ref  23:701)  emphasizes  as  a major  evolution  of  the 
last few years: the design of a pattern recognition system has
come to be a highly iterative process. A major part of this design
process is acquiring necessary and sufficient prior knowledge.
A major problem in this design process is deciding exactly
what knowledge is necessary and how much of that is sufficient
for  pattern  recognition.  This  decision  is  made  through  a two-path 
modeling  process. 

Box (Ref 2:24) discusses a philosophy of model building.
Fig 1 presents his three stage procedure to find adequate models
from known data. In pattern recognition the data are the patterns
of interest. Here two paths produce a classification model and
a representation  model.  These  are  respectively  equivalent  to 
Webster's  present  perception  and  previous  knowledge.  In  the  one 
path,  features  model  the  patterns.  In  the  other  path,  class 
defining  structures  model  the  pattern  environment.  Through  the 
former  we  come  to  know  the  latter. 

Box  explains  his  procedure  as  follows.  In  the  first  stage 
system  knowledge  is  used  to  hypothesize  tentative  models.  Here 
statistically  inefficient  methods  are  used  because  precise 
formulations  are  not  yet  available.  In  the  second  stage,  parameters 
are  estimated  for  the  tentative  model.  Non-linear  least  squares 
procedures are used to estimate these parameters and then covariance
matrices. After fitting the tentative model to observed
data,  in  the  third  stage,  the  fitted  model  is  checked  in  relation 
to  the  observations  so  as  to  reveal  model  inaccuracies  and  achieve 
improvement.  Inspection  of  error  functions  indicates  whether  the 
entertained  model  is  adequate,  or  if  and  how  the  model  is  to  be 
revised.  After  diagnostic  checks  satisfy  the  user  as  to  model 
adequacy,  the  derived  model  is  used. 

The appeal of Box's process lies in its generality. It
applies equally well to each path. Fig 2 shows these paths.
Clearly these paths are not independent. Production of an error
rate requires both features and a class defining structure.
Obviously the class defining structure is built in terms of
features. However, feature identification does not end once a
class defining structure has been derived. Nor is pattern recognizer
design complete once an error rate has been validated.
This is the point of this general discussion.

The  two  paths  of  Fig  2 lead  into  the  next  two  sections  of 
this chapter. They cover feature selection and pattern classification.
Pattern classification is presented first.

Pattern  Classification 

Put simply, in terms of the notation stated earlier, the
task of a pattern classifier is to assign an unknown pattern $F_n$
from an unknown data set $\Omega'$ to that class $\omega_i \subset \Omega$ with whose members
$F_n$ shares the greatest similarity. This assignment can
be made in several ways. Bayesian classifiers, minimum distance
and nearest neighbor classifiers are germane to this thesis.

Bayesian Classifiers. In these classifiers the a priori
probability of $\omega_i$ and the class-conditional probability density
functions of the members of class $\omega_i$ are explicitly known.
Decision functions $d_i(F_n)$ are used to establish class membership.
That is, the probability of misclassification is minimum when

\[ d_i(F_n) = p(F_n \mid \omega_i) \Pr(\omega_i), \quad i = 1, \ldots, I \tag{2-8} \]

is a maximum with respect to a choice of i. Therefore

\[ d_k(F_n) = \max_i \{ d_i(F_n) \} \;\rightarrow\; F_n \in \omega_k \tag{2-9} \]
In this expression the a priori probability is often assumed
identical for each class. It is also common to assume the multivariate
normal density which is

\[ p(F_n \mid \omega_i) = \left( (2\pi)^{J/2} |\Sigma_i|^{1/2} \right)^{-1} \exp\left( -\tfrac{1}{2} (F_n - P_i) \Sigma_i^{-1} (F_n - P_i)^T \right) \tag{2-10} \]

The symbols $F_n$, $P_i$, J and $\Sigma_i$ are all used as earlier defined.
Using equation (2-9) a decision function can be written using the
monotonic log function to simplify the exponential form of the
Gaussian density.

\[ d_i(F_n) = \ln[\Pr(\omega_i)] - \tfrac{1}{2} \ln|\Sigma_i| - \tfrac{1}{2} (F_n - P_i) \Sigma_i^{-1} (F_n - P_i)^T \tag{2-11} \]

(Dividing all $p(F_n \mid \omega_i)$ by the common factor $(2\pi)^{-J/2}$ does not change their relative
magnitudes.) In a Bayes classifier, the set of decision functions
relates the unknown pattern to all classes. The maximum decision
function provides the index of the class to which the unknown
feature vector is assigned (Ref 13:13).
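
A minimal sketch of the decision rule of equations (2-9) and (2-11) follows. It is an illustration in Python, not the thesis implementation. To avoid a general matrix inverse it assumes a diagonal covariance matrix, so that the determinant is the product of the per-feature variances and the quadratic form is a weighted sum of squares. The class names, means, variances and priors are hypothetical.

```python
import math


def discriminant(feature, mean, variances, prior):
    """Decision function of equation (2-11), ASSUMING a diagonal covariance.

    variances holds the diagonal of the covariance matrix, so ln|Sigma| is
    the sum of the logs and the quadratic form is a weighted sum of squares.
    """
    log_det = sum(math.log(v) for v in variances)
    quadratic = sum((f - m) ** 2 / v
                    for f, m, v in zip(feature, mean, variances))
    return math.log(prior) - 0.5 * log_det - 0.5 * quadratic


def bayes_classify(feature, classes):
    """Equation (2-9): assign the class whose decision function is largest."""
    scores = {name: discriminant(feature, c["mean"], c["var"], c["prior"])
              for name, c in classes.items()}
    return max(scores, key=scores.get)


# Hypothetical two-class, two-feature problem with equal priors.
classes = {
    "A": {"mean": [0.0, 0.0], "var": [1.0, 1.0], "prior": 0.5},
    "B": {"mean": [4.0, 4.0], "var": [1.0, 1.0], "prior": 0.5},
}
print(bayes_classify([0.5, 1.0], classes))
print(bayes_classify([3.0, 3.5], classes))
```

With equal priors and equal covariances, as in this example, the rule reduces to choosing the nearest class mean, which anticipates the minimum distance rule discussed next.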

Minimum Distance Rules. Many classification procedures
can be said to follow this technique. The simplest of them first
establish a prototype for each class. Then the unknown is assigned
to that class whose prototype is closest, in a Euclidean distance
sense, to the unknown. This rule requires two assumptions. One is that
if $F_\ell$ and $F_{\ell+1} \in \omega_i$, the vector $(F_\ell - F_{\ell+1})$ is also in $\omega_i$ (Ref 12:11).
This concept is required to justify the usual choice of the centroid
of the class as its prototype. It also supports the second
assumption which is that similarity between patterns is consistently
reflected by the Euclidean metric on the feature space. This
rule can be concisely stated as follows:

\[ d_k(F_n) = \min_i \{ |F_n - P_i| \} \;\rightarrow\; F_n \in \omega_k \tag{2-12} \]

where $1 \le i \le I$.
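
The rule of equation (2-12) may be sketched as follows, using the class centroid as the prototype. This is a Python illustration under assumed names and invented training data, not the classifier coded for this thesis.

```python
import math


def centroid(vectors):
    """Class prototype P_i: the componentwise mean of the class members."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance |a - b| between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_prototypes(training):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vectors) for label, vectors in training.items()}


def min_distance_classify(feature, prototypes):
    """Equation (2-12): assign the class whose prototype is closest."""
    return min(prototypes,
               key=lambda label: euclidean(feature, prototypes[label]))


# Hypothetical training data: two classes, two features each.
training = {
    "A": [[0.0, 0.0], [2.0, 0.0]],
    "B": [[10.0, 10.0], [12.0, 10.0]],
}
prototypes = build_prototypes(training)
print(prototypes)
print(min_distance_classify([3.0, 1.0], prototypes))
```

Only one prototype per class is stored, so the storage and the number of distance computations grow with the number of classes I rather than with the number of samples N.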

Nearest Neighbor (NN) Classifiers. Fix and Hodges
(Ref 11) are credited with suggesting a variant of this classification
rule. Again a set of distances is computed for the
unknown $F_n$. However, the assumption that the members of a class
form a convex set is not needed. This is because the measured
distances relate $F_n$ to each $F_\ell$ within each $\omega_i$. The unknown
pattern is assigned to the class which contains its nearest
neighbor. The assumption that the Euclidean metric consistently
reflects pattern similarity must still hold. The rule is robust
since it can be sensitive to any actual distribution of $F_\ell$ given
that $\Omega$ is sufficient. If a vote is taken among the K nearest
neighbors of $F_n$ then a K-NN rule is said to be used. The risk
of error in this latter rule tends to the Bayes risk as K and N
tend to infinity with K/N tending to zero. Das Gupta (Ref 9:15) notes that NN rules are
also related to rules based on estimates of density functions.
The obvious problems with the NN rule are a sensitivity to bad
data points, and a computational cost for data storage and
execution time which tends to become excessive as the NN risk
tends toward the Bayesian risk.
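
The NN and K-NN rules described above can be sketched in a few lines. The Python below is an illustration only. The sample set is invented, and it deliberately contains one mislabeled point near class A to show the sensitivity to bad data points. Ties in the vote are broken in favor of the class encountered first in nearest-first order, which is an assumption of this sketch and not a rule stated in the text.

```python
import math
from collections import Counter


def knn_classify(feature, labelled, k=1):
    """Vote among the k nearest stored vectors (k = 1 gives the NN rule).

    labelled is a list of (feature_vector, class_label) pairs. Every stored
    vector is compared with the unknown, which is the storage and execution
    cost noted in the text.
    """
    ranked = sorted(labelled, key=lambda pair: math.dist(feature, pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical sample set; the last entry is a "bad" point labelled B
# that lies among the members of class A.
labelled = [
    ([0.0, 0.0], "A"), ([1.0, 0.0], "A"), ([0.0, 1.0], "A"),
    ([5.0, 5.0], "B"), ([6.0, 5.0], "B"),
    ([1.2, 1.2], "B"),
]
unknown = [1.0, 1.0]
print(knn_classify(unknown, labelled, k=1))
print(knn_classify(unknown, labelled, k=3))
```

In this example the single nearest neighbor is the bad point, so the 1-NN rule is misled, while the vote among three neighbors recovers class A.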

Comments. Three comments on classification rules establish
a perspective  for  the  algorithms  developed  in  this  thesis. 

(1)  Das  Gupta  (Ref  9:15)  notes  that  the  usefulness  of  a 
classification  rule  is  determined  by  its  simplicity  as  well  as  its 
robustness.  Although  conceptual  simplicity  is  useful  in  that  a 
rule  may  be  easily  understood,  computational  simplicity  produces 
the  efficiency  which  permits  a rule  to  be  used  effectively  in 
practice. 

(2)  There  are  complicated  treatments  of  indecision  zones 
and  tolerance  regions  which  may  be  asymptotically  optimal  for 
large  numbers  of  classes  (Ref  9:13).  These  may  justify  the 
simplistic  approach  of  covering  the  feature  space  with  as  many 
"tight"  subclasses  as  possible  in  order  to  optimize  classification. 

(3)  Chen  (Ref  4:6)  notes  that  experimental  results  have 
established  that  there  is  always  a small  subset  of  good  learning 
samples  which  dominate  performance.  This  possible  insensitivity 
to  sample  size  of  good  quality  neighborhoods  can  lead  to  an 
experimental  procedure.  In  it,  one  uses  analytical  intuition  to 
uncover  the  kernel  of  good-neighbor  patterns  which  may  define  the 
optimal class. Undesirable samples can be said to belong to the
"husk"  of  such  a class.  The  idea  is  an  outgrowth  of  that  of  the 
edited  or  condensed  NN  rule  which  attempts  to  eliminate  samples 
on  the  wrong  side  of  class  boundaries. 



Feature  Selection 

The  term  "identify"  was  used  deliberately  in  the  first 
block  of  the  features  path  in  Fig  2.  It  covers  extraction  of 
measurements  which  characterize  digitized  pattern  data.  It  also 
encompasses  the  selection  of  the  minimum  subset  of  these  values 
which  is  adequate  for  acceptable  classification.  Extraction  is 
a problem dependent task. The more general question of selection
is addressed below.

The problem here is essentially one of computational benefit.
The number of features extracted from the pattern data is
often deliberately too great. (See Chapter 4 under benchmarks.)
This leaves a need to reduce the measurement set to one whose
size is manageable. There are many possible subsets. The total
number to be evaluated when j features are selected from J
features is

\[ T = \binom{J}{j} = \frac{J!}{j!\,(J-j)!} \tag{2-13} \]

There  are  many  techniques  which  have  been  applied  to  this 
evaluation.  The  problem  is  one  of  choosing  a better  subset.  It 
is  an  accepted  fact  that  there  is  only  one  guaranteed  way  to  find 
the  best  subset.  Cover  has  shown  this  to  be  exhaustive  search 
(Ref  8:117).  Jain  reports  that  added  features  may  actually  degrade 
the performance of a classifier. Thus subset selection is motivated
by more than an interest in computational efficiency (Ref 21:1).
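
The growth of the subset count in equation (2-13), and the exhaustive search it implies, can be illustrated as follows. This is a Python sketch. The figure-of-merit function toy_score is invented for the example, with features 0 and 1 assumed to be redundant; in practice the score would be an empirical error rate.

```python
import math
from itertools import combinations


def subset_count(total_features, chosen):
    """Equation (2-13): number of subsets of size j drawn from J features."""
    return math.comb(total_features, chosen)


def exhaustive_search(total_features, chosen, score):
    """Evaluate every subset of the given size and return the best one."""
    return max(combinations(range(total_features), chosen), key=score)


# Hypothetical merits for four features; features 0 and 1 carry nearly
# the same information, so using both earns a penalty.
MERIT = [0.9, 0.8, 0.5, 0.4]


def toy_score(subset):
    penalty = 0.6 if 0 in subset and 1 in subset else 0.0
    return sum(MERIT[f] for f in subset) - penalty


print(subset_count(4, 2))
print(subset_count(20, 5))
print(exhaustive_search(4, 2, toy_score))
```

Even the modest case of choosing 5 features from 20 requires 15504 evaluations, which is why the cheaper search procedures discussed next are of interest.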


Subset selection methods are basically search procedures.
There  is  basic  agreement  that  the  best  control  on  such  a search 
procedure  is  to  estimate  probability  of  error  by  computing  the 
empirical  error  rate  on  a large  test  data  set  (Ref  34:72).  The 
simplest  subset  selection  algorithms  establish  a figure  of  merit 
for  each  feature  and  then  pick  the  best  n features.  Sequential 
ordering  processes  are  used  to  reduce  computation.  Chen  (Ref  3:89) 
notes  that  dynamic  programming  is  a good  technique  for  sequential 
search.  He  states  that  the  search  for  one  best  feature  at  a 
time  is  computationally  the  most  efficient.  Stearns  describes 
the  bias  that  may  unintentionally  derive  from  previous  selections 
in such a search. Sequential searches produce nests of subsets
in which

\[ S_1 \subset S_2 \subset \cdots \subset S_n \]

Features that are "powerful" in early stages remain in the final
set even though they may no longer be needed. He suggests a
"plus m, take away n" search to address the fact that the two best
features may not be the best pair (Ref 34).
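
The nesting effect, and the way a "plus m, take away n" search can escape it, may be sketched as below. This Python illustration follows the general idea only; it is not claimed to reproduce Stearns' procedure in detail. The score table is invented so that feature 0 is the best single feature while features 1 and 2 form the best pair.

```python
def forward_step(selected, candidates, score):
    """Add the single unused feature that gives the best score."""
    best = max((f for f in candidates if f not in selected),
               key=lambda f: score(selected | {f}))
    return selected | {best}


def backward_step(selected, score):
    """Drop the single feature whose removal leaves the best score."""
    drop = max(selected, key=lambda f: score(selected - {f}))
    return selected - {drop}


def sequential_forward(candidates, target_size, score):
    """Nested search: each subset contains the previous one."""
    selected = frozenset()
    while len(selected) < target_size:
        selected = forward_step(selected, candidates, score)
    return selected


def plus_m_take_away_n(candidates, target_size, score, m=2, n=1):
    """Alternate m additions with n removals until target_size is reached."""
    if not m > n >= 0 or target_size + n > len(candidates):
        raise ValueError("need m > n >= 0 and target_size + n <= features")
    selected = frozenset()
    while len(selected) < target_size:
        for _ in range(m):
            if len(selected) < len(candidates):
                selected = forward_step(selected, candidates, score)
        for _ in range(n):
            selected = backward_step(selected, score)
    while len(selected) > target_size:  # trim any overshoot
        selected = backward_step(selected, score)
    return selected


# Hypothetical scores (higher is better) for every subset of three features.
TABLE = {
    frozenset(): 0.0,
    frozenset({0}): 0.70, frozenset({1}): 0.60, frozenset({2}): 0.50,
    frozenset({0, 1}): 0.75, frozenset({0, 2}): 0.72,
    frozenset({1, 2}): 0.90, frozenset({0, 1, 2}): 0.91,
}


def table_score(subset):
    return TABLE[frozenset(subset)]


features = {0, 1, 2}
print(sorted(sequential_forward(features, 2, table_score)))
print(sorted(plus_m_take_away_n(features, 2, table_score)))
```

The nested search keeps feature 0 once it has been chosen and ends with the pair (0, 1). The alternating search is able to discard feature 0 after features 1 and 2 have both been added, and ends with the better pair (1, 2), at the cost of additional score evaluations.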

In  summary,  computational  cost  is  a key  factor  in  subset 
selection.  The  most  critical  element  of  any  search  procedure 
appears  to  be  evaluation  of  error  probability.  This  is  best 
estimated  by  an  empirical  error  rate.  Finally,  while  nested 
selection  procedures  may  bias  results,  they  offer  efficiency  of 
implementation. 

Pattern Recognition Applications

The  algorithms  implemented  for  this  thesis  are  evaluated  in 
terms  of  two  differing  applications  of  pattern  recognizers  in 
Chapter  IV.  A brief  background  on  these  different  applications  is 
given  below. 

Character Recognizers. Considerable work has been done
at AFIT in investigating techniques which apply to the recognition
of two-dimensional data. In these efforts features have
been extracted from various digital representations of pictures
using the two-dimensional Fourier transform. This is consistent
with the work of Kabrisky whose research produced a model of the
human visual system (Ref 22). Tallman's dissertation indicates
that handprinted characters can be recognized by use of low
frequency filtered Fourier components (Ref 35). Efforts by
Sponaugle to generalize this work towards recognition of multifont
typeset letter data are the basis for test data and benchmark
comparisons given later in this report (Ref 33).

Waveform  Recognizers.  Signal  classification  can  use  pattern 
recognition  techniques  to  advantage.  Feucht's  recent  article  in 
Computer  Design  is  motivated  by  this  fact  (Ref  10:68).  Hall  and 
Bouvier  produced  AFIT  theses  dealing  successfully  with  waveform 
pattern  recognizers  (Refs  14,  1).  Radar  signature  pattern 
recognizers are found in Air Force operations. The classifier
algorithm implemented for this thesis was originally designed by
the author for use in a Space Object Identification application
(Ref 25). Many of the procedures present in this thesis are
eclectic outgrowths of the synergy of that development project.
These range from the concept of biased samples to which Chen
attests  (Ref  4:60)  to  the  use  of  asymmetric  class  boundaries 
(Ref  32).  Finally,  a sample  of  radar  signatures  was  used  by 
Kulchak  (Ref  24)  to  produce  the  Frequency  of  Binary  Words  (FOBW) 
feature  vectors  referenced  later  in  this  report. 




III.  Requirements 


In  this  chapter  the  structure  of  the  thesis  is  developed. 
The  goals  and  objectives  of  the  project  are  stated.  These  are 
addressed in a short discussion of underlying assumptions. Thereafter
follows a statement of the functional requirements for the
development  system  produced  in  this  effort.  A bubble  chart  is 
presented  and  used  to  explain  the  concept  of  system  data  flow 
upon  which  this  development  system  is  based.  A short  statement 
of  design  and  coding  standards  is  then  given.  Selection  of  a 
name  for  the  system  concludes  the  chapter. 

Goals  and  Objectives 

The ultimate purpose of this thesis is to support experimental
implementation of microprocessor based pattern recognizers.
Meeting this goal requires production of a system of programs.
This system is intended to be a designer's tool. As such, it aims
to facilitate the process of recognizer development, and to drive
that development towards a specific microprocessor implementation.
The system is also intended to be used and modified by
students as they develop, experiment with, and investigate pattern
recognition algorithms.

In order to achieve these goals, three specific development
objectives are stated for the system. Its design is required
to model a key recognizer element, the pattern classifier. To
simplify  student  implementation  of  pattern  recognizers,  this 
model  classifier  is  to  be  programmed  for  a specific  microprocessor. 
The  system  design  is  also  required  to  generalize  the  process  of 
deriving  a class  defining  data  structure.  The  classifier  bases 
its  decisions  upon  this  structure.  Thus,  system  error-rate  is  a 
function  of  this  structure.  Effective  generalization  of  this 
process  makes  the  system  an  effective  tool  for  designers  of 
pattern recognizers in general. Finally, a series of benchmark
performance measurements is required. These demonstrate the
system as a framework for both potential users and experimenters.
They also serve to qualify system worth. All of these requirements
boil down to three specifics:

(1)  Design  and  implement  a pattern  classifier  for  a 
microprocessor  system. 

(2)  Design  and  implement  the  supporting  functions  necessary 
to  generate  the  class  defining  data  structure  with  which  the 
classifier  can  make  acceptable  decisions. 

(3)  Experimentally  demonstrate  the  above. 

Assumptions 

The worth of the goals set above becomes clear in a
discussion of several assumptions, which follows.

Microprocessors are readily available, inexpensive, and
small  in  size.  Small  microprocessor  systems  can  become  elements 
of  large  networks.  These  systems  can  be  interfaced  to  large 
random access memories (RAM), disk storage, and high speed processing
technology. In the light of Namin's concept which introduced
this report, one should therefore assume that microprocessors
must  be  addressed  by  any  effort  to  upgrade  sensor  data  processing. 

The  task  of  implementing  a pattern  recognizer  crosses 
many  disciplines.  Data  processing  obstacles  can  be  major  ones  to 
individuals  otherwise  highly  qualified  to  analytically  determine 
significantly  discriminating  pattern  features.  The  task  of  tuning 
an  optimal  classifier  or  generating  a class  defining  structure 
may  similarly  sidetrack  would-be  designers  whose  talents  tend 
towards  the  more  critical  task  of  designing  efficient  feature 
extraction  hardware.  Given  these  postulates,  the  worth  of  a 
general purpose design tool with a pre-selected classifier algorithm
becomes clear. This argument strengthens considerably when
the  would-be  designer  is  a thesis  student  pressed  by  time. 

Pre-selection of a simplistic classifier as an element of
a recognizer system may provide a benefit aside from its economy.
An optimum classifier can only optimize the processing of its input
features. It may well be far more critical to the implementation
of successful pattern recognizers to place limited "model-T"
systems in the environment than to initially seek high performance
systems. The search for better input features becomes tedious and
intractable without a computer yardstick for their evaluation.
What better yardstick is there than the performance of a "model-T"
classifier which operates in the actual data environment? The
answer to the foregoing question is obviously moot. Future experiments
may resolve it.

Required  Functions 

The  specific  objectives  stated  above  were  analyzed  in  the 
light  of  the  concepts  and  techniques  of  pattern  recognition  which 
were presented in the previous chapter. Broad functional requirements
were thus derived to accomplish the stated objectives. These
functional requirements were then studied with data processing and
software design considerations in mind. From this effort a data
flow diagram was produced which reflects the overall system operation.
This data flow diagram and the functions it embodies are
described in the following paragraphs.

System  Segments.  The  system  should  consist  of  two  segments. 
One,  a Classifier  Segment,  should  implement  the  selected  pattern 
classifier  design  in  a microprocessor.  The  other,  an  Interpreter 
Segment,  should  implement  those  functions  required  to  interpret  a 
sample  data  set  of  feature  vectors  in  such  a way  as  is  required 
to  produce  a class  defining  data  structure  fit  for  the  classifier. 
The specific functional requirements for each of these segments
are stated in the two paragraphs below.

(1) The Classifier Segment should consist of software which
resides  in  a microprocessor.  This  software  should  implement  the 
classifier  and  its  supporting  routines.  It  should: 

(a)  be  able  to  assign  unknown  patterns  to  their 
proper  classes  with  an  acceptable  error-rate. 

(b)  be  able  to  record  classification  decisions. 

(c) be coded so as to be independent of the locations
and sizes of buffers required for feature data and for the
class defining structure.

(d) be coded so as to be independent of the number
of features and the number of classes which comprise a given
application.

(e)  be  implemented  within  less  than  256  bytes  of 
memory  to  allow  storage  within  one  ROM  data  page  of  100H  locations. 

(2) The Interpreter Segment should consist of software
which can be used as readily as possible to produce a class defining
structure for the former segment. In this sense it should:

(a) be coded in FORTRAN using a top-down structured
design, and conforming as closely as possible to ANSI standards
so as to maximize intelligibility, modifiability, and transportability.

(b) be able to adjust the size of memory buffers
used for data files and internal structures to fit the size of
user resources.

(c) be able to generate and to refine a class
defining  data  structure  which  fits  the  classifier  segment. 

(d)  be  able  to  select  and  evaluate  a subset  of 
pattern  features  for  its  capacity  in  discriminating  between  pattern 
classes. 

(e)  be  able  to  support  analytical  evaluation  of 
class  and  feature  characteristics. 

(f)  be  able  to  support  efficient  transfer  of  the 
class  defining  structure  to  microprocessor  storage. 

(g)  be  able  to  produce  and  document  a simulated 
error-rate  for  the  microprocessor  implementation  of  the  classifier. 

(h)  be  able  to  operate  in  either  an  interactive  or 
a batched  computer  process. 

System Data Flow. An analysis of the data processed by
the system led to the bubble chart presented in Figure 3. This
chart reflects the requirement for two system segments and
indicates their conceptual and physical interface. The Interpreter
Segment processes feature data and generates class definitions.
These two data types are the primary system currency. Class
definitions are denoted prototypes for convenience. These are
based upon the feature vector data provided to the system. These
latter data are organized for efficient system use in the process
labeled "CREATE" on the figure. Multiple feature vector files
provide a capacity to store test samples, segregate patterns
typical of data classes, and subset the overall data set into
manageable pieces. Class definitions, or prototypes, are produced
by the process labeled "DEFINE" on the chart. This process allows
refinement of specific prototypes by selective use of input feature
data. The capacity of the complete class defining structure to
assign feature vectors to their proper classes is measured by a
classification error-rate. This is documented by the process
labeled "TRYOUT" on the figure. This same process supports
selection of feature subsets, and evaluation of these subsets in
terms of their respective classification error-rates. The process
labeled "FORMAT" on the chart configures the class defining
structure for transfer to the classifier segment. It also satisfies
the requirement to support analysis of feature data by producing
various graphic displays. These include three-dimensional plots
of histogram data produced by the "CREATE" and the "DEFINE"
processes. These displays reflect the distribution of values
occurring within a given feature dimension both within the entire
data set and within a given class. The basic process of the
classifier segment is reflected by the label "DECIDE" on the figure.
This process receives its input from the sensor environment
through a process which is implicit on the chart. This is the
process of feature vector generation which is assumed to operate
in a parallel and controlling relation to the "DECIDE" process.

FIG 3. SYSTEM BUBBLE CHART - DATA FLOW

Standards are applied to ensure that the system which is
produced meets general requirements. That is, it must be
intelligible, modifiable, and transportable. These requirements
affect software design and program coding.

Design. The expression of requirements in this chapter
illustrates the key design standard to be applied to the
development of this system. This standard requires that design
decisions be made in a structured sequence. In this process, basic
ideas are successively decomposed into subordinate concepts. These
concepts are refined and the process is repeated until it has
produced concrete tasks, specifications and definitions. The
process is called structured design by IBM (Ref 20). Earlier,
Niklaus Wirth termed it development by stepwise refinement (Ref 36).
Applied to the design of computer software, the technique requires
that the functions of a program solution first be specified.
Then the data processed by each function are identified. Finally,
functional relationships are determined. Program and data
specifications are refined in parallel. Binding decisions about
process logic and data representation are delayed as long as
possible. Thus the advantages of various data formats become clear
in contrast to one another. Processing paths are produced by
choice and not forced by prior decision or arbitrary assumption.
Wirth justifies his technique of stepwise refinement with the
argument that it produces a degree of modularity which greatly
eases program adaptation to changes of purpose, function, or
operating environment. This modularity therefore becomes a
supporting requirement to ensure the modifiability and
transportability of the system.

Programming. Adherence to American National Standards
Institute (ANSI) FORTRAN standards facilitates transportability.
Use of structured programming conventions enhances intelligibility,
modifiability and transportability. Use of these standards and
conventions is therefore a supporting requirement.

ANSI FORTRAN standards are clearly defined for CDC FORTRAN
IV (Ref 7). This FORTRAN includes ANSI standard X3.9-1966. Since
FORTRAN is a well-used and documented language, these standards
are widely exceeded by off-the-shelf compilers. Therefore adherence
to the standard often imposes a restriction. Some of the more
important cases in which CDC FORTRAN IV should be restricted for
this project are listed below.

(1) Input/output statements will use the syntax READ (u,f)
iolist or WRITE (u,f) iolist as defined by CDC.

(2) Data labels will be restricted to six characters.

(3) Data statements will not use implicit loop syntax.

(4) Hollerith constants will only appear in data
statements or subroutine call statements, and will use the nH syntax
as defined by CDC.

(5) Array references will be consistent with dimension
specifications.

(6) Only sequential file access logic will be used.

(7) Subscript expressions will be avoided.

(8) Mixed mode expressions will be avoided.

(9) Non-standard system functions and subroutines will be
avoided.

(10) Deviations from ANSI standards will be commented in
the program code.

Structured programming conventions are guidelines which
simplify program construction as much as they enhance program
modifiability. FORTRAN does not admit such key structured
programming constructs as the DO-WHILE. Moreover, FORTRAN
provides a GOTO construct which must be used at times. However,
insofar as possible structured programming technique will be
used. When logic structures are complex, indentation will be
used. The code will be segmented as much as possible. Each
subroutine will have a single entry and a single exit. Module
sizes will be limited to one page if possible. Logic flow will
be sequential, with imbedded procedure calls, as much as
possible. To ensure intelligibility of the program code, a ratio of
at least one explanatory comment to each seven source lines
will be maintained. Finally, meaningful names will be used
wherever possible (Ref 20:8-1).

System Name

Consistent with the last convention stated above, the
name assigned to this development system should be descriptive.
An 8080 microprocessor system is available to support this
project. The system's classifier segment will be coded to
operate on this microprocessor system. This classifier is
defined in the following chapter. It references n-dimensional
rectangular regions in its assignment of class membership. These
can be visualized as boxes in n-space. For these reasons, the
system is called the BOX80 system.

IV. Algorithms


The design of the BOX80 system rests upon its classification
algorithm. A specialized data structure supports this algorithm.
It contains user-provided pattern features and related values from
which pattern class boundaries are defined. To produce this
structure, one of several data representation algorithms is first
applied to the user feature data. Class prototypes are defined.
Then a heuristic feature subset selection algorithm is applied to
these prototypes to reduce the size of the class defining data
structure. This facilitates microprocessor implementation of the
classifier. All of these algorithms were tested individually
against various performance benchmarks before their
implementation in the BOX80 system. Then as the system was developed,
the algorithms were exercised as system modules were verified.
Algorithms for data representation, classification and feature
subset selection are discussed in this chapter. Related performance
benchmarks, and testing procedures for system modules are presented
as well.

Data  Representation 

To  allow  comparison  of  histogram  displays  between  classes, 
and  to  enable  byte  sized  component  output  for  microprocessor  use, 
scaling  options  are  provided. 


In creating the BOX80 feature vector data file, three
scaling options are provided to standardize the range of component
variation. These simplify later data comparisons. They are
implemented by an energy, a unitizing, and a shifting transform.
Each of these scaling options maintains relative angles between
vectors. However, vector magnitudes vary. Given a feature vector
$F_n$ with components $f_{nj}$, these three options produce a new
vector $F_n'$ as follows.

Energy normalization:

    $F_n' = F_n / e$                                              (4-1)

    where $e = \sum_{j=1}^{J} f_{nj}^2$                           (4-2)

Unit normalization:

    $F_n' = F_n / \|F_n\|$                                        (4-3)

    where $\|F_n\| = \left( \sum_{j=1}^{J} f_{nj}^2 \right)^{1/2}$   (4-4)

Shift normalization:

    $F_n' = m F_n + B$                                            (4-5)

    where $m = 1/(a+b)$                                           (4-6)

    in which $a = \max \{ f_{nj} \mid n = 1, \ldots, N;\ j = 1, \ldots, J \}$

             $b = -\min \{ f_{nj} \mid n = 1, \ldots, N;\ j = 1, \ldots, J \}$

    and N = number of vectors in the data set
        J = dimensionality of the feature space


From the above, it is clear that each $F_n'$ results from a
linear shift of the original $F_n$. Therefore relative angles
between the $F_n'$ remain the same as the angles between the $F_n$.
However, vector magnitudes do vary. For shift normalization
there is a constant variation for the entire set $\{F_n\}$. For unit
normalization, all vector magnitudes collapse to unity. In energy
normalization while the energies of the $F_n'$ become unity, their
magnitudes become less than 1.
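The energy and unit normalizations can be illustrated with a minimal Python sketch. It assumes that the energy transform divides a vector by its sum of squared components and that the unit transform divides by the Euclidean magnitude; the function names are hypothetical and are not part of the BOX80 code. The shift transform is left out because its offset vector is not fully specified here, and a zero vector is not handled.

```python
import math

def energy_normalize(vec):
    """Divide each component by the vector energy (sum of squares)."""
    e = sum(x * x for x in vec)
    return [x / e for x in vec]

def unit_normalize(vec):
    """Divide each component by the Euclidean magnitude of the vector."""
    mag = math.sqrt(sum(x * x for x in vec))
    return [x / mag for x in vec]

def magnitude(vec):
    """Euclidean magnitude of a vector."""
    return math.sqrt(sum(x * x for x in vec))

def cosine(u, v):
    """Cosine of the angle between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (magnitude(u) * magnitude(v))

# Two illustrative feature vectors (made-up values).
f1 = [3.0, 4.0]
f2 = [1.0, 2.0]

unit1, unit2 = unit_normalize(f1), unit_normalize(f2)
energy1, energy2 = energy_normalize(f1), energy_normalize(f2)
```

Both transforms multiply a vector by a positive scalar, so the cosine between `f1` and `f2` is unchanged while the magnitudes differ from the originals.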

An additional transform is provided. This 'squaring'
transform increases the precision possible in component values.
However, it causes a twisting of the feature space which may
change 'natural' relationships. It is provided as an input
transform for experimentation only. This transform standardizes each
feature component to the range apparent in the data set. This
facilitates observation and measurement of data variation in each
dimension of the feature space. Transformed vectors are produced
as follows.

Squaring transform:

    $F_n' = F_n T^{-1} + B$                                       (4-8)

    in which T = diagonal J x J matrix of $t_{jj}$,

    where $t_{jj} = (a_j + b_j)$ for

          $a_j = \max \{ f_{nj} \mid n = 1, \ldots, N \}$
          $b_j = -\min \{ f_{nj} \mid n = 1, \ldots, N \}$

    and $B = (b_1, \ldots, b_J)$ for $b_j$ as defined above.

In this transform both relative angles, and magnitudes of $F_n'$ vary
from those of $F_n$.

Normalization of feature component values using component
variances measured from the user data set was considered as a
possibility. Since there is some possibility that the
distributions represented in that data set will not reflect those of
the true population, this means of normalizing component values
was not implemented. To cover the possibility that true
population minimum and maximum values are not represented in the user
data base, the ranges (a+b) referenced above can be extended by
a fractional proportion with little problem.

In  the  generation  of  the  microprocessor  data  structure 
which  defines  class  boundaries,  a transformation  is  necessary  to 
map  feature  vector  and  prototype  components  into  an  eight  bit 
range.  Here,  the  squaring  transformation  of  equation  (4-8)  is 
used  since  it  preserves  the  greatest  component  precision.  Since 
class  boundaries  exist  at  this  point,  no  distortion  of  performance 
occurs.  Use  of  this  transformation  implicitly  assumes  that  it 
can  be  embedded  into  an  independent  feature  generation  process 
efficiently.  This  is  a simple  operation  requiring  only  one  add 
and  one  multiply  for  each  feature. 
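A minimal Python sketch of such a byte-range mapping follows. It rests on several assumptions not stated in the text: the offset $b_j$ is added before the gain is applied, so that the observed range of each feature maps onto 0 to 255, the result is truncated to an integer, and out-of-range values are clamped. The function names are hypothetical.

```python
def byte_scale_params(vectors):
    """Derive one (offset, gain) pair per feature from the observed range.

    offset is b_j = -min and gain is 255 / (a_j + b_j); both are computed
    once, so that scaling a feature later costs one add and one multiply.
    """
    params = []
    for column in zip(*vectors):
        a = max(column)                      # a_j: observed maximum
        b = -min(column)                     # b_j: negated observed minimum
        t = a + b                            # t_jj: observed range
        gain = 255.0 / t if t > 0 else 0.0   # constant feature maps to 0
        params.append((b, gain))
    return params

def to_bytes(vec, params):
    """Map one feature vector into the eight bit range 0..255."""
    out = []
    for f, (offset, gain) in zip(vec, params):
        value = int((f + offset) * gain)     # one add, one multiply, truncate
        out.append(max(0, min(255, value)))  # clamp values outside the range
    return out

# Illustrative data set of three two-dimensional feature vectors.
data = [[2.0, -1.0], [10.0, 1.0], [6.0, 0.0]]
params = byte_scale_params(data)
encoded = [to_bytes(v, params) for v in data]
```

The clamp matters when a later sample falls outside the range seen in the data set; widening the range by a fractional proportion before computing the gain would reduce how often that happens.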

In transforming class definitions there are two separate
algorithms used. First, as given in equation (4-8),

    $F_n' = F_n T^{-1} + B$.

Similarly, for class mean vectors, known as prototypes,

    $P_i' = P_i T^{-1} + B$.                                      (4-9)

This prototype transform is readily derived at the vector level
as follows:

    $P_i' = \frac{1}{L} \sum_{l=1}^{L} F_l'$                      (4-10)

          $= \frac{1}{L} \sum_{l=1}^{L} (F_l T^{-1} + B)$         (4-11)

          $= \left( \frac{1}{L} \sum_{l=1}^{L} F_l \right) T^{-1} + B$   (4-12)

          $= P_i T^{-1} + B$                                      (4-13)

where

    L = the order of class i
    and T, B are defined as in (4-8)

The second algorithm operates on class boundaries. These
are established by means of diagonal matrices referenced to the
prototype vector. These matrices are explained in detail in the
next section. To simplify this discussion of their
transformation, consider class boundaries to have been defined by a
diagonal class covariance matrix, $\Sigma_i$. The transformation for
this class diagonal covariance matrix is clearly understood at the
component level,

    $\sigma_{ij}' = \left\{ \frac{1}{L} \sum_{l=1}^{L} (p_{ij}' - f_{lj}')^2 \right\}^{1/2}$   (4-14)

    $= \left\{ \frac{1}{L} \sum_{l=1}^{L} \left[ (p_{ij}/t_{jj} + b_j) - (f_{lj}/t_{jj} + b_j) \right]^2 \right\}^{1/2}$   (4-15)

    $= \left\{ \frac{1}{L} \sum_{l=1}^{L} (p_{ij} - f_{lj})^2 / t_{jj}^2 \right\}^{1/2}$   (4-16)

    $= \frac{1}{t_{jj}} \left\{ \frac{1}{L} \sum_{l=1}^{L} (p_{ij} - f_{lj})^2 \right\}^{1/2}$   (4-17)

    $\sigma_{ij}' = \sigma_{ij} / t_{jj}$                         (4-18)

where $\sigma_{ij}'$ is the jth component of $\Sigma_i'$
      $p_{ij}'$ is the jth component of $P_i'$
      $f_{lj}'$ is the jth component of $F_l'$
      $t_{jj}$ is the jth member of T
      $b_j$ is the jth member of B
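The two transformation results, that a class mean transforms like any feature vector and that a class spread is simply divided by $t_{jj}$, can be checked numerically in one feature dimension. The Python sketch below assumes the spread is the root mean square deviation from the class mean; the sample values and the constants `t` and `b` are illustrative only.

```python
import math

def mean(values):
    """Arithmetic mean of one feature dimension of a class."""
    return sum(values) / len(values)

def spread(values):
    """Root mean square deviation from the class mean (assumed sigma_ij)."""
    m = mean(values)
    return math.sqrt(sum((m - v) ** 2 for v in values) / len(values))

def transform(values, t, b):
    """Component-level form of F' = F T^-1 + B for one feature dimension."""
    return [v / t + b for v in values]

samples = [2.0, 4.0, 9.0]   # one feature dimension of one class (made up)
t = 8.0                     # illustrative t_jj
b = -2.0                    # illustrative b_j
scaled = transform(samples, t, b)

mean_error = abs(mean(scaled) - (mean(samples) / t + b))
spread_error = abs(spread(scaled) - spread(samples) / t)
```

Both error terms come out at floating-point noise level: the offset `b` cancels inside every deviation, leaving only the factor `1/t`.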


This transformation is provided as an option prior to the
calculation of classification error rates. The option, through
its use of integer calculations, allows simulation of
microprocessor performance by the BOX80 system. The transformation is
also exercised prior to output of the class defining data
structure in microprocessor format. This allows byte sized encoding
of output component values.


Classification  Algorithm 

A feature vector associated with an unknown pattern is
assigned to a known data class by a classification algorithm.
The BOX80 system classification algorithm partitions hyperspace
into regions which can be visualized as hyperspace boxes. Class
membership is derived from the identifier of the hyperspace box
which contains the unknown feature vector. Since these boxes
need not necessarily be mutually exclusive of one another, the
containment property is obtained through a distance measurement
with which decision ambiguities are resolved.

The BOX80 classification algorithm was designed to
maximize operating efficiency within a microprocessor implementation.
Minimum use of memory, as required, reduces execution time. This
algorithm was also designed with the number of feature dimensions
and the number of pattern classes as parameters of its execution.
Any combination of I classes and J feature dimensions can be
processed given that sufficient memory is available.

The algorithm is implemented within both of the BOX80
system segments. There are small variations between these
implementations. In one instance the implementation is in FORTRAN.
Here, the referenced data structure is a two-dimensional array
containing a collection of vectors. Each class is defined by
a set of three of these vectors. Two options are provided for this
implementation. One uses a Euclidean norm for the distance
measurement rather than the supremum norm. The other option
enables processing of scaled data. It substitutes truncated
integers for real values of referenced vector components. In
the second instance the algorithm has no options. Its referenced
data structure is a linear list partitioned into a series of
segments, one for each data class. This instance occurs in the
microprocessor based classifier routine. It is written in the
assembly language for the 8080 (Ref 16).

Memory requirements for data used by the above two
implementations of the BOX80 classifier are calculated in terms of
the numbers of classes (I) and features (J) to be processed.
Memory (M) required for the Interpreter Segment's FORTRAN data
structure is

    $M = (J+3)(2I+K)$                                             (4-19)

Memory required for the 8080 Classifier Segment implementation
is calculated

    $M = [(3J)+1](I)$                                             (4-20)

The FORTRAN implementation references a data structure in which
vector dimensionality has been increased by three extra values.
This produces the factor (J+3). The factor K indicates the number
of classes having asymmetric boundaries. This differs from the
8080 implementation which adds only one extra value, a class
identifier, to each class. This implementation assumes that each
class has asymmetric boundaries.
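The two memory formulas are easy to tabulate when sizing an application. The Python sketch below evaluates equations (4-19) and (4-20) directly; the function names are hypothetical, and the units (array elements for the FORTRAN structure, bytes for the 8080 structure) are an assumption based on the byte sized encoding described for the microprocessor format.

```python
def interpreter_memory(num_features, num_classes, num_asymmetric):
    """Equation (4-19): M = (J + 3)(2I + K), assumed to count array elements."""
    return (num_features + 3) * (2 * num_classes + num_asymmetric)

def classifier_memory(num_features, num_classes):
    """Equation (4-20): M = (3J + 1) I, assumed to count bytes.

    Each class holds a prototype, a lower and an upper boundary vector
    (3J values) plus one class identifier.
    """
    return (3 * num_features + 1) * num_classes

# Illustrative application: 8 features, 4 classes, 1 asymmetric class.
fortran_elements = interpreter_memory(8, 4, 1)
classifier_bytes = classifier_memory(8, 4)
```

For this example the 8080 structure needs 100 bytes and the FORTRAN structure 99 elements.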

The algorithm implements a variation of the minimum
distance classification rule. An unknown vector is assigned
membership in that class to which it is nearest. However, this algorithm
exhibits facets of other common classifier algorithms. From the
perspective that the algorithm references the multivariate
covariance of each class' features, it can be considered a
variant of a Bayesian decision rule. However, no formulation of
the a priori probability of class membership is made. Furthermore,
feature dimensions must be assumed to present uncorrelated,
independent measurements of pattern variation. Finally, these feature
measurements must be assumed to be completely representative of
pattern class membership and must be assumed to generate Gaussian
distributions. Therefore, although the algorithm has a
statistical flavor, it is not a true Bayesian algorithm. However, from
the standpoint that its referenced data structure partitions the
feature space into a collection of hyperspace boxes each of which
bounds a neighborhood of a given class, it can be considered a
variant of a condensed nearest neighbor rule. This perspective
is justified by the fact that each class boundary is statistically
constructed so as to enclose an advantageous subset of class
members. Here, in discriminating between classes to produce the
classification assignment, the evaluation of distances to class
boundaries is analogous to evaluation of distances to the nearest
neighbors of the unknown pattern. The weakness in this comparison
lies in the fact that the BOX80 algorithm tends to benefit from
convex class boundaries. The NN algorithm needs no such
assumption.

The data structure which establishes each class' boundaries
consists of a vector and a pair of diagonal matrices. The vector
is a class mean or prototype vector. For class i consisting of
a set $\omega_i$ of L feature vectors of dimensionality J, this
prototype vector is

    $P_i = \frac{1}{L} \sum_{F_l \in \omega_i} F_l$               (4-21)

The two diagonal matrices establish class boundaries in terms of
component variation from this mean. These matrices are most
clearly defined at the component level. Consider a class $\omega_i$ of
feature vectors represented by L members of dimensionality J. A
feature vector within $\omega_i$ is

    $F_l = (f_{l1}, \ldots, f_{lj}, \ldots, f_{lJ})$              (4-22)

and the prototype vector for the class is

    $P_i = (p_{i1}, \ldots, p_{ij}, \ldots, p_{iJ})$              (4-23)

The diagonal matrix which establishes boundaries less than this
prototype is

    $Z_i^- = [z_{jj}^-]$,  J x J                                  (4-24)

The diagonal matrix which establishes boundaries greater than the
prototype is similarly represented

    $Z_i^+ = [z_{jj}^+]$.                                         (4-25)

Note that the subscripts of matrix components do not reflect
membership in class i. This is simply a convenient notation.

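These components are formed as follows.

    [Equations (4-26) and (4-27) are not legible in the source text.
    The surviving fragment shows the condition iff $f_{lj} > p_{ij}$
    governing the component $z_{jj}$.]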

In defining a class in terms of a class mean vector and two
boundary matrices, a minimum Euclidean distance algorithm can be
constructed. However, a scaled distance measurement is used here.
That is, the distance of an unknown vector from a class prototype
will be measured in each component dimension in terms of a number
of boundary units. This is a distance measure similar to the
Mahalanobis distance. Given uncorrelated features, and using the
simplifying assumption made for equations (4-14) to (4-18)

    $P(F_n \in \omega_i) \geq \prod_{j=1}^{J} (1 - c_j^{-2})$.    (4-28)

Where the features are correlated, this probability can be
written

    $P(F_n \in \omega_i) \geq \max \left( 0,\ 1 - \sum_{j=1}^{J} c_j^{-2} \right)$   (4-29)

These bounds are derived from Tchebychef's inequality by Godwin
(Ref 12:63).
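A small Python sketch can show how such Tchebychef-type bounds behave. It assumes the standard multivariate forms: a product of per-dimension terms $(1 - 1/c_j^2)$ for uncorrelated features, and $\max(0, 1 - \sum_j 1/c_j^2)$ otherwise, where $c_j$ is taken to be the box half-width in dimension j measured in boundary units with $c_j \geq 1$. Both the forms and the meaning of $c_j$ are assumptions here, and the function names are hypothetical.

```python
def bound_uncorrelated(c):
    """Lower bound prod(1 - 1/c_j^2), assuming uncorrelated features."""
    p = 1.0
    for cj in c:
        p *= 1.0 - 1.0 / (cj * cj)
    return p

def bound_correlated(c):
    """Lower bound max(0, 1 - sum(1/c_j^2)), with no independence assumed."""
    return max(0.0, 1.0 - sum(1.0 / (cj * cj) for cj in c))

# Two features, box edges two boundary units from the prototype.
two_units = [2.0, 2.0]
independent = bound_uncorrelated(two_units)
general = bound_correlated(two_units)
```

With two features and boxes two boundary units wide, the uncorrelated bound is 0.5625 and the general bound is 0.5. The general bound falls to zero quickly as dimensions are added, which is one reason narrow boxes in many dimensions carry little guaranteed coverage.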

To assign class membership to an arbitrary feature vector
$F_n$ with components $f_j$, first a composite boundary matrix, $Z^i$,
is formed for each class i. This produces

    $Z^i = [z_{jj}^{(i)}]$                                        (4-30)

In this composite boundary matrix

    iff $f_j > p_{ij}$  then  $z_{jj}^{(i)} = z_{jj}^{+}$         (4-31)

and

    iff $f_j < p_{ij}$  then  $z_{jj}^{(i)} = z_{jj}^{-}$.        (4-32)

Distance from an unknown $F_n$ to this class is next computed, first
as a vector and then as a scalar. This effects a classifying
decision rule as follows

    $D_{in} = (P_i - F_n) Z^i$                                    (4-33)

    $d_{in} = \|D_{in}\|$                                         (4-34)

The scalar $d_{in}$ is considered a member of the set

    $\Delta = \{ d_{1n}, \ldots, d_{in}, \ldots, d_{In} \}$.      (4-35)

Class membership is then assigned to that class to which
distance is minimum. That is

    $d_{kn} = \min \{ d_{in} \} \Rightarrow F_n \in \omega_k$     (4-36)

Several notes about this algorithm are worthwhile. The
two-sided approach to defining class boundaries was suggested by
Pacheco (Ref 32:11) in the course of a review of the radar
signature recognizer described in Chapter 2. The simpler process
which uses a single boundary matrix to define both sides of a
symmetric hyperspace boundary for a class can be described as a
minimum distance classifier having a Mahalanobis distance metric.
The assumption that feature dimensions are uncorrelated and
therefore independent allows the composite boundary matrix, $Z^i$, to
be considered as a diagonal covariance matrix, $\Sigma_i$. In this case
the distance measurement to the i-th class can be written

    $d_{in}^2 = (F_n - P_i) \Sigma_i^{-1} (F_n - P_i)^T$.         (4-37)
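The equivalence of this expression to equation (4-33) is readily
seen in a simple example. Let dimensionality J=2, and

    $X = (P_i - F_n)$                                             (4-38)

    where $X = [\, p_{i1} - f_{n1} \quad p_{i2} - f_{n2} \,] = [\, x_1 \quad x_2 \,]$   (4-39)

    Let $\Sigma_i^{-1} = \begin{bmatrix} 1/\sigma_{11}^2 & 0 \\ 0 & 1/\sigma_{22}^2 \end{bmatrix}$,  J x J = 2 x 2   (4-40)

    where $\sigma_{jj} = \left\{ \frac{1}{L} \sum_{l=1}^{L} (p_{ij} - f_{lj})^2 \right\}^{1/2}$

In this example it is notationally clear that

    $d_{in}^2 = X \Sigma_i^{-1} X^T$                              (4-41)

    $= [\, x_1 \quad x_2 \,] \begin{bmatrix} 1/\sigma_{11}^2 & 0 \\ 0 & 1/\sigma_{22}^2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$   (4-42)

From the rules of matrix algebra, this is

    $d_{in}^2 = [\, x_1/\sigma_{11}^2 \quad x_2/\sigma_{22}^2 \,] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$   (4-43)

which is

    $d_{in}^2 = x_1^2/\sigma_{11}^2 + x_2^2/\sigma_{22}^2$.       (4-44)

Equation (4-44) defines the square of the Euclidean norm in two
space. Thus one sees that

    $d_{in} = \|D_{in}\|$                                         (4-45)

    where $D_{in} = [\, x_1 \quad x_2 \,] \begin{bmatrix} 1/\sigma_{11} & 0 \\ 0 & 1/\sigma_{22} \end{bmatrix}$   (4-46)

which is

    $D_{in} = (P_i - F_n) Z^i$.                                   (4-47)

In this way the equivalence of equations (4-37) and (4-33) has
been demonstrated.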
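The same equivalence can be checked numerically. The Python sketch below assumes a diagonal covariance with standard deviations `sigma`, so that the composite boundary matrix acts as a per-component division by `sigma`; the values are illustrative and the function names are hypothetical. It also evaluates the supremum norm on the same scaled difference vector for comparison.

```python
import math

def mahalanobis_sq(p, f, sigma):
    """Quadratic form (F - P) Sigma^-1 (F - P)^T for a diagonal Sigma."""
    return sum(((fj - pj) / sj) ** 2 for pj, fj, sj in zip(p, f, sigma))

def scaled_difference(p, f, sigma):
    """D = (P - F) Z with Z = diag(1 / sigma_j)."""
    return [(pj - fj) / sj for pj, fj, sj in zip(p, f, sigma)]

def euclidean_norm(d):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in d))

def sup_norm(d):
    """Supremum norm: the largest absolute component."""
    return max(abs(x) for x in d)

p = [5.0, 1.0]        # prototype (illustrative)
f = [2.0, 3.0]        # unknown feature vector (illustrative)
sigma = [1.5, 0.5]    # per-dimension boundary units (illustrative)

d = scaled_difference(p, f, sigma)
```

Here `d` is `[2.0, -4.0]`, its squared Euclidean norm equals the quadratic form (20), and the supremum norm simply reports the worst single dimension (4.0), which needs no squaring or square root.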

The foregoing presentation of the BOX80 classification
algorithm avoids one issue and glosses over another. The former
is a programmatic statement of the actual algorithm which
references the defined computer data structure. This is presented at
the close of this chapter. The latter is the derivation of the
BOX80 nomenclature. This explanation follows.

(1) The J dimensional region defined by equation (4-37) forms
an ellipsoid in hyperspace whose shape is specified by $\Sigma_i$
(Ref 13:36). This ellipsoid has its axes oriented along the axes
of the space since $\Sigma_i$ is diagonal.

(2) The J dimensional region defined by equation (4-33)
forms a hyperrectangle about the prototype vector, $P_i$. This
results from a computationally simplifying norm used to produce
the magnitude of $D_{in}$. This norm is defined as follows

    $\|D_{in}\| = \sup (|x_1|, \ldots, |x_j|, \ldots, |x_J|)$     (4-48)

    where $D_{in} = (P_i - F_n) Z^i = (x_1, \ldots, x_j, \ldots, x_J)$   (4-49)

The rectangular aspect of these class regions, from the
sup norm distance metric, becomes clear in this figure.
Classification Algorithm:

    procedure CLASS (FEAT(J,L), IC)
    begin
       set DMIN := 1E10
       for all classes I do
       begin
          set DMAX := -1E10
          set NCAV  (to index class I, P)
          set NCSDL (to index class I, Z-)
          set NCSDR (to index class I, Z+)
          if NCSDR eq 0 then
             set NCSDR := NCSDL
          for all dimensions J do
          begin
             set NCSD := NCSDL
             set DFEAT := CLAS(J,NCAV) - FEAT(J,L)
             if DFEAT lt 0.0 then
                set NCSD := NCSDR
             set DFEAT := DFEAT/CLAS(J,NCSD)
             if ABS(DFEAT) gt DMAX then
                set DMAX := ABS(DFEAT)
          end "J"
          if DMAX lt DMIN then
          begin
             set DMIN := DMAX
             set IC := I
          end
       end "I"
       "BOX80 CLASSIFIER"
    end "CLASS"
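For readers who want to experiment with the decision rule, a minimal Python sketch is given here. It is not the thesis's FORTRAN or 8080 code. It assumes each class is a tuple of identifier, prototype, lower boundary widths and upper boundary widths, that the upper width applies when the feature exceeds the prototype component and the lower width otherwise, that all widths are positive, and that the running minimum distance is updated whenever a closer class is found. All names are hypothetical.

```python
def classify(feature, classes):
    """Return the identifier of the nearest class under the sup norm.

    classes is a list of (class_id, prototype, lower, upper) tuples.
    upper may be None for a symmetric class, in which case lower is
    used on both sides of the prototype.
    """
    best_id = None
    d_min = float("inf")
    for class_id, prototype, lower, upper in classes:
        if upper is None:
            upper = lower
        d_max = 0.0
        for f, p, lo, hi in zip(feature, prototype, lower, upper):
            width = hi if f > p else lo            # pick the boundary side
            d_max = max(d_max, abs(p - f) / width) # distance in boundary units
        if d_max < d_min:                          # keep the closest class so far
            d_min = d_max
            best_id = class_id
    return best_id

# Two illustrative classes in a two-dimensional feature space.
classes = [
    ("A", [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]),     # asymmetric box
    ("B", [10.0, 10.0], [4.0, 4.0], None),         # symmetric box
]
```

As a usage example, `classify([3.0, 1.0], classes)` measures 1.5 boundary units to class A and 2.25 to class B, so the vector is assigned to A. Only subtraction, division, comparison and absolute value are needed per feature.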

Feature  Selection 

Good features make good pattern recognizers. Unnecessary
pattern features make inefficient pattern recognizers. Thus,
identifying the best features is important to developing an
acceptable pattern recognizer.

The literature reflects considerable work done to solve the
general problem of identifying features. This problem may be
approached in one of three ways. Firstly, one may rely upon
analytical theory to identify just the set of features which should
be extracted from the pattern environment. However, theory does
not always identify a set of measurements which suffice to
completely classify a pattern environment. In another technique, one
may compute a large set of candidate features and then rely on
transforms and filters prior to classification to generate a
smaller set of significant factors. In a third method, one may
evaluate a candidate set of features in the light of a
classification algorithm, and preselect the most desirable subset. The
recognizer then operates directly on this subset of features
without added processing.


An assumption underlying this thesis has been that
convenience and efficiency are more critical factors in developing
an initial recognition model than a proven optimality or a
comprehensive analytical basis. Stearns (Ref 34:71) notes that
from the standpoint of hardware, reducing the original set of
measurements by principal component analysis and transforms
may even produce a loss in overall system performance. His
argument allows that when a transform to a subspace is effected,
all features of the original space have to have been generated.
Thus, even though subsequent processing may benefit by reduced
subspace dimensions, the computational costs of feature
extraction must still be carried. This argument led to the
development of a subset selection algorithm to implement the BOX80
system.

Prior selection of an acceptable subset of features has
advantages for microprocessor implementations of distributed
pattern recognition systems. In such a system the master processor
can be used to extract features from the environment. Its feature
extraction software may initially be coded to generate many
feasible and reasonable pattern characteristics. The slave
processor can be used to execute a pattern classifier and produce
recognition decisions. Once a subset of features has been
selected by a process such as that supported by the BOX80
Interpreter Segment, the feature extraction algorithm can be
streamlined by straightforward deletion of extraneous computations.
The result is a process which uses less time. Then the new class
defining data structure is provided to the classifier and another
data-gathering, recognizer evaluation cycle can begin.

Search algorithms for finding a better subset of features
have two common elements as described in Chapter II. Estimation
of error probability by calculation of an empirical error rate
is the best evaluation for any feature subset. Before a subset
can be evaluated, it must be constructed by a mapping from the
original feature set. The BOX80 system does not implement a
search algorithm. Instead, the search iteration is opened to
the user. Thus, the user can specify the mapping which creates
the subset to be tested. He can also control the search
iteration by his evaluation of the empirical error rate which applies
to the subset of interest.

In order to guide the user towards selection of trial subsets
of features, a figure of merit is calculated for each feature.
This figure of merit reflects the contribution that its associated
feature makes to the recognition decision. To establish this
contribution a set of interclass distance vectors is computed.
Combinations of these vectors produce diagonal matrices whose
components are the figures of merit for their respective feature
dimensions. Three different matrices are computed based upon
the distance measurement of equation (4-33). In this case, a
prototype vector representing an 'unknown' class is substituted
for the unknown feature vector of the original equation. Two sets
of matrices $\{D_{in}\}$ and $\{D_{ni}\}$ are computed for each class
as follows.

$$X_{in} = (P_i - P_n)\,Z^i \tag{4-53}$$

and

$$X_{ni} = (P_n - P_i)\,Z^n \tag{4-54}$$

where $1 \le i \le I$, $1 \le n \le I$, and matrices $Z^i$ and $Z^n$
are established as for equation (4-33). A diagonal matrix is
constructed from each of the vectors $X_{in}$ and $X_{ni}$ simply by
considering the vector components as the appropriate members of the
matrix diagonal. Thus

$$D_{in} \leftarrow X_{in} \quad \text{and} \quad D_{ni} \leftarrow X_{ni}.$$

The set of matrices $\{D_{in} \mid 1 \le n \le I\}$ establishes the
distances from class i to each of the other class prototypes in the
data structure. Components of these diagonal matrices are measured in
the boundary units of class i. On the other hand, the set of matrices
$\{D_{ni} \mid 1 \le n \le I\}$ reflects the opposing distances to
class i from each of the other class prototypes in the data structure.
Components of these diagonal matrices are measured in the boundary
units of each of the "other" classes. Opposing matrices $D_{in}$ and
$D_{ni}$ are rarely the same, which indicates that this distance
measurement does not form a metric on the discrete space of prototype
vectors.

A series of experiments was used to evaluate these interclass
distances. One set of merit figures resulted from each experiment.
A measure of the 'volume' of each class was sought. Three were
produced. For the first (subscripted 3 below to match program
code), a distance matrix was calculated for each class,

$$V_i = \sum_{n=1}^{I} (D_{in} + D_{ni}), \quad n \ne i \tag{4-55}$$

Then a merit matrix for the feature space was derived from these,

$$M_3 = \sum_{i=1}^{I} V_i \tag{4-56}$$

The components of this diagonal matrix became figures of merit
for their respective feature dimensions.

Each component of the diagonal matrix, $M_3$, is related to
the total interclass distance in its dimension. Experimentation
with these component values as merit figures led to the
realization that overlap between a pair of classes in a given
dimension was not as well reflected in this figure of merit as
possible. This can be seen in a numerical example. Let

$$V_1 = \begin{bmatrix} 16.0 & 0 & 0 \\ 0 & 16.0 & 0 \\ 0 & 0 & 8.0 \end{bmatrix} \quad \text{and} \quad V_2 = \begin{bmatrix} 1.0 & 0 & 0 \\ 0 & 0.5 & 0 \\ 0 & 0 & 9.0 \end{bmatrix}$$

for a 2 class, 3 dimension instance. Note that in this case
$V_i = D_{in}$. Here the merit matrix is

$$M_3 = \begin{bmatrix} 17.0 & 0 & 0 \\ 0 & 16.5 & 0 \\ 0 & 0 & 17.0 \end{bmatrix}$$

and the differences among the feature merits, $m_{jj}$, are not
appreciable. However, the components of the postulated $V_i$ show that
in feature dimension 3 the classes are almost equally separated
at large distances of 8.0 and 9.0 boundary units. Therefore, the
classes are readily separable in this dimension. This is clear
from the operation of the classifier algorithm which computes for
this dimension,

$$\Delta = (d_{13}, \ldots, d_{i3}, \ldots, d_{I3}) \tag{4-57}$$

in which

$$d_{13} = (p_{13} - p_{23})\,z_{33}^{(1)} \tag{4-58}$$

Allowing that $z_{33} = 1/\sigma_{33}$ for symmetric classes, and
considering each class in turn as the unknown,

$$(p_{13} - p_{23}) = 8.0\,\sigma_{33}^{(1)}$$

and

$$(p_{23} - p_{13}) = 9.0\,\sigma_{33}^{(2)}.$$

In a Tchebyshev sense there is little likelihood of confusion
between the two classes in dimension 3. However, similar
computations for dimension 1 indicate

$$(p_{11} - p_{21}) = 16.0\,\sigma_{11}^{(1)}$$

and

$$(p_{21} - p_{11}) = 1.0\,\sigma_{11}^{(2)}.$$

Here, in the same sense, the likelihood of confusion between classes
is great. A similar condition exists to an even greater degree in
dimension 2. To rectify this situation another set of merit
figures was computed as follows.

$$M_2 = \prod_{i=1}^{I} V_i \tag{4-59}$$

In this case, the components of the diagonal matrix $M_2$ are more
sensitive to the appearance of a small component within some
matrix $V_i$. Using the matrices of the previous example this
$M_2$ matrix is

$$M_2 = \begin{bmatrix} 16.0 & 0.0 & 0.0 \\ 0.0 & 8.0 & 0.0 \\ 0.0 & 0.0 & 72.0 \end{bmatrix}$$

Here there is clear indication of the strength of feature 3. The
appearance of the relatively small values 8.0 and 16.0 at
components $m_{11}$, $m_{22}$ of $M_2$ indicates that in these
features many classes are relatively close to one another. However,
a given feature may discriminate well between all but one class.
This instance is not reflected well by the components of $M_2$.
Thus a third merit matrix was generated. This is

$$M_1 = \sum_{i=1}^{I} \ln\left(\prod_{n=1}^{I} D_{in}\right), \quad n \ne i \tag{4-60}$$

This formulation differs from the earlier ones in the use of a
logarithmic sum, and in the use of matrices $D_{in}$ only.
Explanation follows.

(1) The logarithmic sum produces merit figures which form
the same ordered sequence by magnitude as the merit figures
produced by the matrix product

$$M_1' = \prod_{i=1}^{I} \left(\prod_{n=1}^{I} D_{in}\right), \quad n \ne i \tag{4-61}$$

However the values of the logarithmic sum are not nearly so
likely to overflow the floating point limit of the computer. It
was hypothesized that this matrix product formulation would reflect
dimensions having single class confusion by a greater variation in
its components than there would exist among components of $M_2$. (The
notion was that in the double product, components would change
geometrically, while in the sum of products they would vary
arithmetically.) Testing with merit figures from $M_1$ and $M_2$ is
reported in the next section. Some experimenting, in a three class
problem, was done with the $M_1'$ figures of merit. These appeared
more robust than $M_2$ figures. However, the 26-class alphabet
problem created overflow in the $M_1'$ matrix. The $M_1$ merit
figures, as can be seen in the next section, do not reflect the
robustness of the $M_1'$ figures.

(2) Merit matrix $M_1$ is formed from matrices $D_{in}$ only,
since this produces results equivalent to those obtained with the
matrix sum $(D_{in} + D_{ni})$ as for matrix $M_2$. This is because

$$\prod_{i=1}^{I} D_{in} = \prod_{n=1}^{I} D_{ni}, \quad i \ne n \tag{4-62}$$

These procedures for establishing merit figures for feature
dimensions have a similar basis to those of Michael and Lin
(Ref 28:172). They produce a means of ordering features in terms of
capacity to discriminate between classes. They are intended only as
a starting point for a heuristic, manually controlled search for a
good subset of features.
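
The behavior of the three formulations can be checked numerically.
The following is a minimal sketch only, not the BOX80 code. It takes
the two class matrices of the example to have diagonals
(16.0, 16.0, 8.0) and (1.0, 0.5, 9.0), an assumption chosen to agree
with the $M_2$ diagonal (16.0, 8.0, 72.0) quoted above. The function
names are hypothetical.

```python
import math

# Assumed diagonals of the two postulated class matrices (one row per class).
V = [
    [16.0, 16.0, 8.0],   # class 1
    [1.0, 0.5, 9.0],     # class 2
]

def merit_sum(v):
    """Sum over classes, per feature (the M3 style of figure)."""
    return [sum(col) for col in zip(*v)]

def merit_product(v):
    """Product over classes, per feature (the M2 style of figure)."""
    return [math.prod(col) for col in zip(*v)]

def merit_log_sum(v):
    """Sum of logarithms over classes, per feature (the M1 style of figure)."""
    return [sum(math.log(x) for x in col) for col in zip(*v)]

def ranking(merits):
    """Feature numbers (1-based) ordered by descending merit."""
    return sorted(range(1, len(merits) + 1), key=lambda j: -merits[j - 1])

m_sum = merit_sum(V)        # nearly flat across features
m_prod = merit_product(V)   # feature 3 stands out
m_log = merit_log_sum(V)    # same ordering as the product, smaller magnitudes
```

The sum gives nearly equal figures for all three features, while the
product and the logarithmic sum both rank feature 3 first. The
logarithmic figures grow additively with the number of classes, which
is why they are less exposed to floating point overflow.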

To establish subsets of features, the BOX80 system uses a
mapping algorithm which maps the original feature space into a
subspace. This mapping process references an ordered list of
feature dimension tags. Each tag is the number of a feature in
the original space. An ordering may be constructed by sorting
these feature tags by their respective merit figures. An
arbitrary order may also be manually input. The mapping algorithm
is imbedded in a routine which computes error rates for J
different subspaces. These error rates can be generated during a
single iteration of the trial classifier. The process of
constructing tentative feature subspaces is thus piggybacked onto
the BOX80 performance evaluation function.

Subspaces constructed by the mapping algorithm are based
on a nesting of proper subsets of features. These subsets
contain an increasing number of features from 1 to J. Each subset
is contained by its successor.

In the classification procedure described earlier, a
distance vector is calculated. This is

$$D_{in} = (P_i - F_n)\,Z^i \tag{4-63}$$

The mapping algorithm operates on the components of this vector
to produce a set of J nested subsets, $S_j$. An error rate is
computed for each of these. An example may clarify the process.
Let J=3 and (3,1,2) be a list of feature tags ordered by figure
of merit. Let the distance vector

$$D_{in} = (14.0,\ 63.1,\ 9.0).$$

Here, the mapping algorithm constructs

$$S_1 = (9.0) \;\subset\; S_2 = (9.0,\ 14.0) \;\subset\; S_3 = (9.0,\ 14.0,\ 63.1)$$

as the set of nested subsets. Each of these is considered a distance
vector in its respective subspace of the original three-dimensional
space. The decision rule is operated on each of these vectors at
once. This is the key point. Rather than operate the decision rule
on each vector in series, these nested vectors are processed in
parallel. Since the max and min functions which implement the
decision rule can be done in a parallel fashion, some execution cost
is saved. Thus, for each j, $1 \le j \le J$,

$$d_{jk} = \min \{\, \|S_j\| \;:\; 1 \le i \le I \,\}$$

and a class assignment is obtained and an error rate is computed for
each subspace.
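
The construction in the example can be sketched in a few lines. This
is an illustration under assumptions, not the thesis code: the
function names are hypothetical, and the 'parallel' evaluation is
read here as a single pass that carries a running maximum across the
nested subsets.

```python
def nested_subsets(distance, tags):
    """Reorder distance components by 1-based feature tags; return nested prefixes."""
    work = [distance[t - 1] for t in tags]
    return [tuple(work[:j]) for j in range(1, len(work) + 1)]

def running_max(values):
    """Largest component of every nested prefix, found in one pass."""
    out, best = [], float("-inf")
    for v in values:
        best = max(best, v)
        out.append(best)
    return out

# Values from the example: J = 3, tags (3, 1, 2), distance vector (14.0, 63.1, 9.0).
subsets = nested_subsets((14.0, 63.1, 9.0), (3, 1, 2))
maxima = running_max(subsets[-1])
```

Each entry of `maxima` is the max-function result for one nested
subspace, so J decisions are available after one sweep over the J
components rather than after J separate sweeps.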

Finally, a special procedure, termed a zapping process, is used
to modify the tentative class definition structure to establish a
chosen subspace as the basis for future trial recognition experiments.

In this process, selected components of all members of the set of
$Z_i^+$ and $Z_i^-$ matrices (which reflect class boundaries) are
increased to large values in each matrix. The effect is to nullify
all measurements made in those dimensions.
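
A minimal sketch of the zapping idea follows. It assumes that a
boundary component acts as a divisor of the prototype difference, as
in the MERIT pseudocode; the constant `ZAP_VALUE` and both function
names are hypothetical, since the actual large value is not stated
here.

```python
ZAP_VALUE = 1.0e9  # hypothetical "large value"; not taken from the thesis

def zap(boundaries, keep):
    """Copy per-class boundary lists, inflating every feature not in `keep`.

    boundaries[i][j] is the boundary of feature j+1 for class i;
    `keep` holds the 1-based feature numbers of the chosen subspace.
    """
    return [
        [b if (j + 1) in keep else ZAP_VALUE for j, b in enumerate(row)]
        for row in boundaries
    ]

def scaled_distance(prototype, unknown, boundary):
    """Per-feature distance in boundary units (absolute value assumed)."""
    return [abs(p - f) / b for p, f, b in zip(prototype, unknown, boundary)]

zapped = zap([[2.0, 4.0, 1.0]], keep={1, 3})
d = scaled_distance([10.0, 50.0, 3.0], [6.0, 10.0, 2.0], zapped[0])
```

With feature 2 zapped, its scaled distance is negligible, so it can
no longer supply the maximum component on which the decision rule
operates.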

The algorithm used for computation of merit figures, and
the algorithm used to map and evaluate feature subspaces are
presented in the following two paragraphs. The former is titled
MERIT. The latter is termed LOOK.

Algorithm for Mapping and Subspace Evaluation:

    procedure LOOK[DIS(J), ITAG(J), RATE(J), NEW, KNOW, I]
    begin
      if NEW eq 1 then
      begin
        for all J do
        begin
          set CLOSE(J) = 1E9
          set ISAV(J) = 0
        end
      end
      for all J do
      begin
        set K = ITAG(J)
        set WORK(J) = DIS(K)
      end "J"
      for all J do
      begin
        set RMAG = -1E20
        for K from 1 to J do
        begin
          if WORK(K) ge RMAG then
            set RMAG = WORK(K)
        end "K"
        if RMAG le CLOSE(J) then
        begin
          set IPICK(J) = I
          set CLOSE(J) = RMAG
        end
        if NEW eq 2 then
          if IPICK(J) eq KNOW then
            set RATE(J) = RATE(J) + 1.
      end "J"
    end "LOOK"
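
For readers who wish to experiment, the LOOK procedure can be
rendered as a short Python sketch. This is an interpretation, not the
BOX80 source: the class name `Look` and its method names are
hypothetical, the NEW flag is replaced by separate `reset` and
`score` calls, and the inner loop over K is folded into a running
maximum.

```python
class Look:
    """Tracks, for each nested subspace, the closest class seen so far."""

    def __init__(self, tags):
        self.tags = list(tags)                 # 1-based feature numbers, in order
        self.rate = [0.0] * len(self.tags)     # correct decisions per subspace
        self.reset()

    def reset(self):
        """Start a new unknown vector (the NEW eq 1 branch)."""
        n = len(self.tags)
        self.close = [1e9] * n
        self.pick = [0] * n

    def offer(self, class_id, dis):
        """Fold in the distance vector measured from one class prototype."""
        work = [dis[t - 1] for t in self.tags]
        rmag = -1e20
        for j, w in enumerate(work):
            rmag = max(rmag, w)                # max over the first j+1 tagged features
            if rmag <= self.close[j]:
                self.close[j] = rmag
                self.pick[j] = class_id

    def score(self, known):
        """Credit each subspace whose pick matches the known class (NEW eq 2)."""
        for j, p in enumerate(self.pick):
            if p == known:
                self.rate[j] += 1.0

look = Look((3, 1, 2))
look.offer(1, (14.0, 63.1, 9.0))   # distances from class 1 (values from the example)
look.offer(2, (20.0, 5.0, 10.0))   # distances from class 2 (made-up values)
look.score(known=1)
```

In this run the one- and two-feature subspaces assign the unknown to
class 1, while the full three-feature space assigns it to class 2
because of the large component in feature 2.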


Algorithm for Figures of Merit:

    procedure MERIT[CLAS(J,I), FT(J,5)]
    begin
      for all J do
      begin
        set FT(J,1) := FT(J,2) := FT(J,4) := FT(J,5) := 1.0
        set FT(J,3) := 0.0
      end "J"
      for all I do
      begin
        set ICAV (to index CLASS I, P_i)
        set ICSDL (to index CLASS I, Z_i^-)
        set ICSDR (to index CLASS I, Z_i^+)
        if ICSDR eq 0 then
          set ICSDR := ICSDL
        for all N except N=I do
        begin
          set NCAV (to index CLASS N, P_n)
          set NCSDL (to index CLASS N, Z_n^-)
          set NCSDR (to index CLASS N, Z_n^+)
          if NCSDR eq 0 then
            set NCSDR := NCSDL
          for all J do
          begin
            if J eq 1 then begin
              set FT(J,4) = 1.0
              set FT(J,5) = 0.0 end
            set DI(J) := CLAS(J,ICAV) - CLAS(J,NCAV)
            set DN(J) := DI(J)
            set ICSD := ICSDL
            set NCSD := NCSDL
            if DI(J) lt 0 then begin
              set ICSD := ICSDR else
              set NCSD := NCSDR end
            set DI(J) := DI(J)/CLAS(J,ICSD)
            set DN(J) := DN(J)/CLAS(J,NCSD)
            set FT(J,3) := FT(J,3) + DI(J) + DN(J)
            set FT(J,4) := FT(J,4) * DI(J)
            set FT(J,5) := FT(J,5) + DI(J) + DN(J)
          end "J"
        end "N"
        for all J do
        begin
          set FT(J,1) := FT(J,1) + Ln(FT(J,4))
          set FT(J,2) := FT(J,2) * FT(J,5)
        end "J"
      end "I"
    end "Merit"

    FT(J,3) contains figures of merit $M_3$
    FT(J,4) contains figures of merit $M_1'$
    FT(J,5) contains figures of merit $M_2$

$$M_1 = \sum_{i=1}^{I} \ln\left(\prod_{n=1}^{I} D_{in}\right), \quad n \ne i$$

$$M_2 = \prod_{i=1}^{I} \left[\sum_{n=1}^{I} (D_{in} + D_{ni})\right], \quad n \ne i$$

$$M_3 = \sum_{i=1}^{I} \left[\sum_{n=1}^{I} (D_{in} + D_{ni})\right], \quad n \ne i$$
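
The three closing formulas can be exercised with a compact Python
sketch. It is offered under stated assumptions rather than as a
transcription of MERIT: classes are taken to be symmetric (one
boundary value per class and feature), distances are taken as
absolute values, the logarithmic accumulator starts at zero, and the
name `merit_figures` and its argument layout are hypothetical.

```python
import math

def merit_figures(prototypes, boundaries):
    """Return (m1, m2, m3), each a list of per-feature merit figures.

    prototypes[i][j] : prototype value of feature j for class i
    boundaries[i][j] : boundary (scale) of feature j for class i, assumed symmetric
    """
    n_classes = len(prototypes)
    n_feat = len(prototypes[0])
    m1 = [0.0] * n_feat   # sum over i of ln(product over n of D_in)
    m2 = [1.0] * n_feat   # product over i of sum over n of (D_in + D_ni)
    m3 = [0.0] * n_feat   # sum over i of sum over n of (D_in + D_ni)
    for i in range(n_classes):
        for j in range(n_feat):
            prod_in, sum_both = 1.0, 0.0
            for n in range(n_classes):
                if n == i:
                    continue
                diff = abs(prototypes[i][j] - prototypes[n][j])
                d_in = diff / boundaries[i][j]   # in boundary units of class i
                d_ni = diff / boundaries[n][j]   # in boundary units of class n
                prod_in *= d_in
                sum_both += d_in + d_ni
            m1[j] += math.log(prod_in)   # fails if two prototypes coincide in feature j
            m2[j] *= sum_both
            m3[j] += sum_both
    return m1, m2, m3

# Two classes, two features (made-up values for illustration).
m1, m2, m3 = merit_figures(
    prototypes=[[0.0, 0.0], [8.0, 1.0]],
    boundaries=[[1.0, 1.0], [2.0, 1.0]],
)
```

Sorting feature numbers by any one of the returned lists, in
descending order, gives a tag list of the kind the mapping algorithm
expects.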


Performance  Benchmarks 


The BOX80 system is a designer's tool. It is intended for
student use in development of experimental pattern recognition
systems. It produces a class-defining data structure upon which
a microprocessor based pattern classifier can operate. BOX80
system performance is reflected in the error rate of its
classifier. This error rate is heavily dependent upon the nature of
the data set from which the class defining data structure is
derived. However, the BOX80 system's algorithms and procedures
do contribute to this performance. No argument is made here
that these algorithms are optimum. Nor is it claimed that BOX80
system procedures are uniquely effective. Nevertheless, these
algorithms and procedures are sufficient to generate class defining
data structures efficiently and effectively. These claims are
supported by the discussion following.

System Efficiency. Here, the cost-benefit trade-off is
critical. It makes no sense to me to optimize a classifier
algorithm on the basis of a data set, however extensive, which cannot
be proven optimal. In the recognition of electromagnetic patterns,
sample data collection is biased almost by definition. Sensor
locations may be constrained; hardware transients may be
unpredictable; the pattern environment may even be simulated. The
BOX80 system is configured to provide a low cost avenue towards the
necessary class defining data structure. Finally, the BOX80
classifier itself is configured for low-cost microprocessor
implementation.

(1) In generating a class defining data structure, the
BOX80 system uses a system segment of four programs. These programs
optimize memory use with generalized data structures and a memory
allocation module. They communicate through standard system data
files. These files and program source code conform to ANSI
standards. Program structure is modular. Design conforms to
top-down concepts. As a result, this system segment is transportable,
and readily modifiable. Since it can be readily configured for
use on any minicomputer or large-scale system, it is a low cost
tool for use in pattern recognizer development. The efficiency
of the individual programs in this segment is not as critical as
the above general cost of using the system. Yet, in the alphabet
classification experiment discussed in this section, the trial
classification process required less than 55K of CDC6600 memory
and executed in less than 23 cpu seconds. This contrasts to the
similar costs of 140K memory and 41 cpu seconds for the specialized
alphabet classifier program which provided comparison data.

(2) The classifier segment of the system uses less than
256 bytes of microprocessor ROM. The class defining data structure,
of course, uses RAM memory in relation to its size as specified
in equation (4-20). No actual timing of the execution of this
segment has been performed. To some extent this timing is
problem-dependent. That is, the total time required to iterate through
the data structure for a given problem depends on the numbers of
classes and features for that problem. In addition, the very
simplicity of this algorithm indicates a speedy execution.

System Effectiveness. Here, the contribution to performance
of system algorithms and procedures is addressed. The classifier
algorithm operates with an error rate within reasonable limits of
that produced by a comparable algorithm on each of two data sets.
Similarly, the algorithm which evaluates feature merit establishes
merit figures which match, within limits, the merit figures
established by other such algorithms on these data sets. Finally,
the procedures for selection of a feature subset, and for
generation of the class defining data structure for a microprocessor,
successfully reduce data structure size without increasing the
classifier error rate significantly. These aspects of system
performance are detailed in the following paragraphs.

Previous thesis work at AFIT produced the two data sets
with which BOX80 system performance has been evaluated (Refs 33,
24). Performance benchmarks were established for each data set.
BOX80 system algorithms were analyzed in terms of these
benchmarks both during design and after implementation. This analysis
follows.

(1) Table I and Figs 5 to 16 apply to the Frequency of
Occurrence of Binary Words (FOBW) data. This data consists of some
500 feature vectors of 14 components which represent patterns from a
three-class recognition problem. Both the Online Pattern Analysis
and Recognition System (OLPARS) (Ref 5) and the Statistical Package
for the Social Sciences (SPSS) (Ref 30) were used to establish error
rates for the classification of this data.

(a) As in other radar pattern recognition problems,
the features in this data set are highly correlated. Table I
presents Pearson Correlation coefficients. These represent an
index of the degree of linear relationship between the features
(Ref 27). As can be seen, fewer than twenty percent of the
meaningful correlations are less than .70. Note that only one of
these is less than .50 and that nearly half of these associate
with feature 6.

(b) Using a Mahalanobis' distance based discriminant
analysis procedure (DISCRIMINANT), SPSS produced an overall
classifier error rate of 26.6 percent. (See Fig 5.) The OLPARS
system also processed this data. With the same statistical measure,
its nearest mean vector procedure (NMV) produced an error rate of
27.7 percent. (See Fig 6.) The BOX80 system error rate, 34.5
percent, is shown in Fig 7. To interpret this figure, notice
that the summary conclusion values are a percent correctly
classified, a percent classified in error, and a percent rejected.

Rows of the BOX80 confusion matrix contain a count of data
vectors belonging to the class, the class id, and standard
confusion matrix assignment percentages. Other data output is
discussed in chapter 5. Note that SPSS and OLPARS algorithms use a
process
    CENTROIDS OF GROUPS IN REDUCED SPACE

    GROUP 1     -1.40581     -.00974
    GROUP 2      1.09814     -.52885
    GROUP 3       .99107      .23744

    DISCRIM FORM DATA

    PREDICTION RESULTS -

    ACTUAL GROUP      N OF     PREDICTED GROUP MEMBERSHIP
    NAME      CODE    CASES    GROUP 1     GROUP 2     GROUP 3

    GROUP 1    1       198     166.        10.         22.
                               83.8 PCT    5.1 PCT     11.1 PCT

    GROUP 2    2        82     2.          54.         26.
                               2.4 PCT     65.9 PCT    31.7 PCT

    GROUP 3    3       190     10.         55.         125.
                               5.3 PCT     28.9 PCT    65.8 PCT

    73.4 PERCENT OF KNOWN CASES CORRECTLY CLASSIFIED

FIG 5. SPSS DISCRIMINANT RESULTS
    Overall Evaluation:

    Dataset kdigrams tttt passed against logic designed on firshalf
    Number of dimensions = 14

                   true class
              AAAA    BBBB    CCCC
    AAAA       190      24      51
    BBBB         1      12       5
    CCCC         8      46     151
    rejt         0       0       0

    totl       199      82     207
    corr       190      12     151
    %cor      95.5    14.6    73.0
    eror         9      70      56
    %err       4.5    85.4    27.1
    rejt         0       0       0
    %rej       0.0     0.0     0.0

    total number of vectors = 488
    overall correct   353 for 72.34%
    overall error     135 for 27.66%
    overall reject      0 for  0.00%

    Overall Evaluation Summary:

    Dataset kdigrams tttt passed against logic designed on firshalf
    Number of dimensions = 14

    AAAA     95.48     4.52
    BBBB     14.63    85.37
    CCCC     72.95    27.05

    overall correct   72.34%
    overall error     27.66%
    overall reject     0.00%

FIG 6. OLPARS NMV ERROR RATE
    TRYOUT

    ENTER OPTIONS
    *
    PNS

    OPENED FEATURE FILE WITH HEADER
    NAME, LABL, JD, LB, IC, MV, IOPT, IHIS, FIRS, FLAS
    FEAT 1111 17 8« 3 198 13 1 .50E-02 .10E+01

    OPENED CLAS FILE WITH HEADER
    NAME LABLC JDX ICX NTC MEUC MKV NENT NCIX ISTM IUKER
    CLAS 111101 17 21 3 50 0 25 19 1 0

    SUBSET CLASS= 88
    SUMMARY CONCLUSION
    .6582 .3413 0.0000
    CONFUSION MATRIX
    198  1  86   1  12
     82  2   9  47  42
    191  3  24  23  52

FIG 7. BOX80 ERROR RATE (FOBW SET 1)
dependent upon a full covariance matrix for each class. This is
many times more expensive in computation time and in memory usage
than the BOX80 algorithm. The OLPARS NMV procedure includes an
option (-2) based upon an inverse weighting matrix. This is
similar to the BOX80 classifier algorithm. Fig 8 shows that
OLPARS' error rate using this option is virtually identical to
the BOX80 error rate. Thus the BOX80 classifier is algorithmically
acceptable. (Note that although BOX80, OLPARS and SPSS all allow
their users options to experimentally define parameters which may
decrease error rates, none were used in any of these experiments.)
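
The summary percentages quoted for these figures follow directly from
the confusion matrix counts. As a check, the short sketch below
recomputes the overall rates of Fig 6 from its counts. The function
name and the dictionary layout are hypothetical, and the count of 46
is an inferred reading, consistent with the BBBB column total of 82.

```python
def summarize(confusion, rejects=None):
    """Overall percentages from a confusion matrix.

    confusion[a][t] = number of vectors of true class t assigned to class a.
    rejects[t]      = number of vectors of true class t that were rejected.
    """
    n = len(confusion)
    rejects = rejects or [0] * n
    total = sum(sum(row) for row in confusion) + sum(rejects)
    correct = sum(confusion[k][k] for k in range(n))
    rejected = sum(rejects)
    error = total - correct - rejected

    def pct(x):
        return round(100.0 * x / total, 2)

    return {
        "total": total,
        "correct": pct(correct),
        "error": pct(error),
        "reject": pct(rejected),
    }

# Counts as read from Fig 6 (rows: assigned class, columns: true class).
fig6 = [
    [190, 24, 51],
    [1, 12, 5],
    [8, 46, 151],
]
summary = summarize(fig6)
```

The result reproduces the 72.34 percent correct and 27.66 percent
error reported by OLPARS for 488 vectors.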

(c) A second sample of vectors from the FOBW data set
was processed using the BOX80 system and using the OLPARS
NMV-2 option. Figs 9 and 10 show respective error rates to be
again nearly identical. Note, however, the over ten percent
increase in the error rate for this sample over that for the
previous sample. This is simply due to differences in the data
collected for each sample. The overall data set was not analyzed
to deliberately extract a worst-case subset. This leads to a
rhetorical argument which is presented as an aside. Assume that
this second sample was actually the initial sample. Allow that it
was accepted as the design test-bed. Consider the development and
usage costs for the software for both iterative generation of a
class defining structure, and for implementation of the classifier.
Would implementation of an optimal piecewise linear hyperplane
be justified?
FIG 8. OLPARS NMV-2 ERROR RATE

    TRYOUT

FIG 9. BOX80 ERROR RATE (FOBW SAMPLE 2)


    Partial Nearest Mean Vector Evaluation:

    Number of dimensions = 14

                   true class
              AAAA    BBBB    CCCC
    AAAA       160      12      42
    BBBB        33      58     121
    CCCC         6      11      44
    rejt         0       0       0

    totl       199      81     207
    corr       160      58      44
    %cor      80.4    71.6    21.3
    eror        39      23     163
    %err      19.6    28.4    78.7
    rejt         0       0       0
    %rej       0.0     0.0     0.0

    total number of vectors = 487
    overall correct   262 for 53.80%
    overall error     225 for 46.20%
    overall reject      0 for  0.00%

    Overall Evaluation Summary:

    Dataset kdigrams ttZt passed against logic designed on firshalf
    Number of dimensions = 14

    node       %c       %e       %r
    AAAA     77.89    22.11     0.00
    BBBB     80.49    19.51     0.00
    CCCC     17.39    82.61     0.00

    overall correct   52.66%
    overall error     47.34%
    overall reject     0.00%

FIG 10. OLPARS NMV-2 SAMPLE-2 ERROR RATE, FOBW DATA SET
(d) The OLPARS system offers a variety of feature
evaluation algorithms. Two were used to evaluate the features of
the samples discussed above. Fig 11 ranks the features on their
ability to separate class pairs. Fig 12 presents overall merit
at interclass discrimination and ranks features in this order.
Fig 13 presents BOX80 merit figures. F/M set "1LOG" corresponds
to the $M_1$ matrix discussed earlier; F/M set "2SUM" corresponds to
the $M_2$ matrix. Features are ordered by descending figure of
merit. It was noted that both BOX80 sets of merit figures
disagree with OLPARS feature ranking. Each set of merit figures was
then compared in terms of the classification errors which its
use produced.

(e) As discussed earlier, the BOX80 feature subset
selection process operates on a set of proper nested feature
subspaces during each trial recognition of the test data set. In
Figs 7, 9 and 13 the summary conclusion percentages reflect use
of the complete set of 14 features in the class defining structure.
The "subspace tags" list gives the order of features used in each
of the nested subspaces which are evaluated. Each tag denotes the
last added feature. The rates presented for each subspace are the
percentage correctly classified followed by the percentage in
error. The nested subspaces are first, that containing the
left-most listed subspace tag, and then, that containing the left-most
pair of tags, and so forth. Examination of Fig 13 shows that
FIG 11. OLPARS OVERALL FEATURE RANK

FIG 12. OLPARS CLASS-PAIR FEATURE RANK


performance peaks at subspace 11 for F/M set "1LOG" and at
subspace 9 for F/M set "2SUM". In both cases feature 2 has just been
added to the subspace. Fig 14 shows BOX80 use of a user defined
set of "subspace tags" which includes features 2 and 1. Again a
performance peak is noted.

It has been noted that exhaustive search is the only method
by which the 'best' subset of features can be found. The foregoing
discussion illustrates how BOX80 algorithms can be used to guide a
heuristic search which improves performance and yet is not exhaustive.
It also illustrates the greater strength of OLPARS' feature
evaluation algorithm. The BOX80 subset evaluation technique has
no counterpart in the OLPARS system, which performs each
classification trial separately.

(f) Fig 15 illustrates the BOX80 procedure for
recording the selection of a subset of features. The newly generated
class defining structure produces an overall error rate of 28
percent. Fig 16 shows the procedure for generating scaled
eight-bit data values for the microprocessor based classifier.
The TRYOUT module option 'B' requests this 'byte' scaling. The
zapping process referenced in the figure nullifies specified feature
dimensions (i.e., those not to be used), by arbitrarily expanding
the value of their respective boundaries (variances) to a large
value. This is further discussed in chapter 5. In this run, a
trial recognition was then accomplished using integer arithmetic.
    SPECIFY SUBSPACE TAGS

    TRYOUT

    ENTER OPTIONS
    *
    FNS

    OPENED FEATURE FILE WITH HEADER
    NAME, LABL, JD, LB, IC, MV, IOPT, IHIS
    FEAT 1111 17 S3 3 198 0 1
    .50E-02 .1SE+01

    OPENED CLAS FILE WITH HEADER
    NAME LABLC JDX ICX NTC

    CONFUSION MATRIX
     82  2   3  41  54
    191  3  14  15  48

    SUBSET CLASS= 0

    11 12 13 14

FIG 15. ERROR RATE - SELECTED FEATURE SUBSET

FIG 16. ERROR RATE - BYTE-SCALED COMPONENTS

This simulates the byte valued operations actually performed in
the microprocessor. Error rate increases by only 1.5 percent and
remains below both the error rate achieved by OLPARS (NMV-2) and
that produced by BOX80 on the original 14 component data set.
From these facts, BOX80 procedures for subset selection, and for
generation of the class-defining structure, are judged acceptable.
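
The byte scaling itself is described in chapter 5 and is not
reproduced here. Purely as an illustration of the idea, the sketch
below maps floating point components onto eight-bit values with a
linear scale. The function name, the linear rule and the clamping are
all assumptions and should not be read as the TRYOUT 'B' option.

```python
def byte_scale(values, lo, hi):
    """Map floats in [lo, hi] onto integers 0..255 (hypothetical linear rule).

    Values outside the range are clamped so that every result fits one byte.
    """
    span = hi - lo
    out = []
    for v in values:
        q = int(round(255 * (v - lo) / span))
        out.append(min(255, max(0, q)))
    return out

scaled = byte_scale([0.0, 0.25, 1.0, 2.0], lo=0.0, hi=1.0)
```

The worst-case rounding error of such a rule is half of one step, or
about 0.2 percent of the scaled range, which suggests why a modest
change in error rate is plausible after conversion to integer
arithmetic.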

(2)  Figs  17  through  23  apply  to  Fourier  transformed 
alphabetic  data.  This  data  set  consists  of  3900  feature  vectors 
of  49  components  each.  The  components  of  these  vectors  are  the  real 
and  imaginary  parts  of  complex  numbers.  These  numbers  are  output 
by  low  frequency  filtered  Fourier  transforms  of  two  space  images 
of  digitized  letters.  The  technique  used  to  produce  these  vectors 
has  been  discussed  in  several  AFIT  theses  (Ref  14,  31)  as  well  as 
in  the  as  yet  unpublished  work  by  Sponaugle  (Ref  33).  These  vectors 
form  a 26-class  problem.  Programs  produced  by  Sponaugle  were  used 
to  establish  benchmark  error  rates  for  classification  of  this  data. 

(a) The components of the vectors in this data set
were assumed to be largely uncorrelated because they had been
generated by an orthogonal linear transform. The use of both
real and imaginary parts of the values output by this transform
suggests the caveat 'largely' since the transform produces
orthogonal complex values. The size of the data set precluded use of
SPSS to generate correlation indices as was done with the FOBW
data.


Fig. 17. Alphabet Classification Experiments
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(b)  This  data  set  was  processed  using  a classification 
program  written  by  Sponaugle.  The  program  uses  a minimum  distance 
algorithm.  It  produces  an  overall  error  rate,  a confusion  matrix 
and individual error rates for each alphabet. Appendix L records
output from this program which is summarized in Fig 17. Sponaugle's
work included heuristic experimentation which attempted to establish
appropriate normalizing transforms with which to precondition the
feature vectors. The original data (after application of centering
algorithms to the data input to the Fourier transform) classified
with an error rate of 18.4 percent. Arguing that "thick" letters
would  in  general  have  larger  vector  magnitudes  than  "thin" 
letters,  as  is  shown  diagrammatically  by  vectors  and 
Sponaugle  normalized  the  feature  vectors  by  their  magnitudes  and 
again classified the data. His least error rate was produced
by experiment D. The intuitively difficult combination of and
Px in this experiment may be explained by the hypothesis that this
normalization retains the angular variation implicit in the original
data vectors while standardizing vector magnitudes. The
BOX80 classifier algorithm was integrated into this minimum distance
classifier. A trial classification produced the 7.1 percent
error rate reported in the figure under item E. The decrease in
error rate, and the significant increase in the count of
alphabetic fonts recognized as identical, qualify the BOX80
classifier as significant. For reference by future AFIT experiments,
the identically recognized alphabetic fonts are recorded
in Table II.
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TABLE  II 


Identically Recognized Alphabets

Experiment A:  28, 48, 104, 139, 16, 33, 35, 41
Experiment B:  28, 9, 10, 139, 16, 33, 35, 75
Experiment C:  28, 9, 10, 139, 16, 33, 35, 75
Experiment D:  28, 9, 10, 139, 16, 33, 35, 75
Experiment E:  28, 9, 10, -, 33, 35, 75, 8, 15,
               19, 26, 30, 32, 25, 27, 29, 48, 50, 58,
               104, 127, 129, 133, 143, 41, 51, 66, 83, 90,
               103, 108, 116, 140, 144, 149, 150


(c) The BOX80 system was used to process a 780 vector
subset of this alphabetic data. A subset was used only to reduce
process time; it does not affect the validity of this benchmark.
A confusion matrix for this process is shown in Fig 18, with an
overall error rate of 4.6 percent. The decrease in error rate
appears to correlate with the fact that the 30-letter sample
per class used in this experiment included 10 of the "identical"
alphabetic fonts reported in Table II. This experiment is
significantly different from that reported above in one important
respect. As noted under "system efficiency" in this section,
the BOX80 classifier used less than 55K of memory and 23 cpu
seconds for its operation. However, the alphabet classifier
required 140K of memory and 205 cpu seconds to complete a trial
classification run. After scaling this execution time by the
reduced size of the BOX80 data sample, a 2:1 throughput increase
is still indicated. The minimal BOX80 memory use results from
its efficient data structures. This contrasts with the far greater
memory requirement of the alphabetic classifier. It should be
noted that the alphabetic classifier accumulates and stores
extensive statistics for output; these account for part of its
memory requirement. The classification rate presented for this
set of 49 component alphabetic feature vectors correlates well with
Tallman's simulated result, 95.80 (Ref 35:86).

(d) Figs 19 through 21 show BOX80 merit figures
computed for this 49 component alphabetic data set. Notice that
subspace error rates all decrease as the number of subspace
features increases. At subspace 20 the error rates are 11, 9,
and 18 percent for merit figures 1, 2, and 0, respectively.
(The 0 set consists of an arbitrary 49 components by order of
increasing dimension.) These error rates support two conclusions.
First, the BOX80 system outperforms the benchmark in both
error rate and number of features. Second, F/M set 2 has the lowest
error rate and is the more robust of the two figures of merit. This
agrees with the analysis reported for the FOBW data set.

FIG 19. ERROR RATE FOR MERIT-1 SUBSPACES

FIG 20. ERROR RATES FOR MERIT-2 SUBSPACES

FIG 21. ERROR RATES FOR ARBITRARY FEATURE SUBSPACES

(e)  Fig  22  presents  confusion  matrices  and  overall 
error  rates  for  feature  subspace  20  from  F/M  set  2.  The  majority 
of  the  errors  are  concentrated  in  separating  classes  15/17  and 
22/23. These classes represent the letters O and Q and the letters
V and W, which are readily confused by printed noise.

(f) Fig 23 shows, again via simulation, an overall
error rate and a confusion matrix for byte scaled component
values. It indicates that the BOX80 system development hypothesis
is justified. That is, with a minimized use of memory, and the
BOX80 classifier, an acceptable error rate can be attained.
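
The 2:1 throughput figure quoted in item (c) can be checked with a short calculation. The sketch below is in Python rather than the FORTRAN of the system itself. It assumes that the alphabet classifier's 205 cpu seconds covered the full 3900 vector data set and that run time scales linearly with the number of vectors; neither assumption is stated in the text, and the names used are illustrative only.

```python
# Figures quoted in item (c); the linear-scaling model is an assumption.
FULL_SET_VECTORS = 3900      # vectors in the complete alphabetic data set
SUBSET_VECTORS = 780         # vectors processed in the BOX80 run
ALPHABET_CPU_SECONDS = 205   # alphabet classifier, assumed to cover the full set
BOX80_CPU_SECONDS = 23       # BOX80 classifier on the 780 vector subset


def scaled_throughput_ratio(full_time, full_count, subset_time, subset_count):
    """Scale the full-set time down to the subset size, then compare times."""
    scaled_full_time = full_time * subset_count / full_count
    return scaled_full_time / subset_time


ratio = scaled_throughput_ratio(
    ALPHABET_CPU_SECONDS, FULL_SET_VECTORS, BOX80_CPU_SECONDS, SUBSET_VECTORS
)
print(f"scaled alphabet classifier time / BOX80 time = {ratio:.2f}")
```

Under these assumptions the scaled alphabet classifier time is 41 cpu seconds against 23 for BOX80, a ratio of roughly 1.8, which is consistent with the approximate 2:1 figure reported.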

Acceptability.  The  term  "acceptable"  has  been  used  freely 
in  the  foregoing  discussion.  From  its  last  use,  in  the  context 
of  all  foregoing  discussion,  a precise  meaning  can  be  inferred. 
Acceptability  is  a complex  function  of  cost  and  benefit.  However, 
it is a relative term which implies not only that resources meet
costs and benefits satisfy requirements, but also that a value
judgment has been made for each case. This is why no one definition
was given.
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Testing  Procedures 

In implementing the BOX80 system, testing was a continuing
process.  Techniques  varied  with  the  routine  or  function  being 
tested.  These  are  indicated  below. 

In  each  module  the  data  processing  flow  was  evaluated  by 
a trace  at  subroutine  exit.  Single  entry,  single  exit  subroutine 
paths  and  selective  output  to  either  the  journal  file  or  the 
terminal  made  this  technique  effective.  Data  buffer  dumps  were 
obtained  from  file  generation  processes  to  verify  input  structure 
and content. To simplify verification of all modules, the basic
utility routines were independently tested. This procedure was
not followed for support routines unique to each module because
of the overhead cost for testing drivers. Finally, a simulator,
INTERP80 (Ref 15), was used to exercise the data processing operations
of the classifier module.

Computational  code  was  verified  by  spot-checked  hand 
calculations,  analyses  for  self-consistency,  and  comparisons  with 
known values. In the last case, benchmark testing provided
comparison values. Output from these benchmarks included statistics
produced via the Statistical Package for the Social Sciences
(SPSS), feature selections identified by the On Line Pattern Analysis
and Recognition System (OLPARS), and classification decisions
obtained from specially written pilot routines. Finally, a trivial
data set was used to verify the computations within the microprocessor
classifier module.

Function options were verified by an attempt at exhaustive
testing. For each option, output values were examined, and file
and module interfaces were checked.

Several special tests were used. Graphics routines were
deliberately passed invalid data to verify program continuity;
there were no unexpected hang-ups. Feature selections were input
to feature subset procedures and used in performance measurements.
Finally, data from two disparate data sets were processed with the
system. Thus, memory allocation algorithms and other adjustments
for number of classes and dimensions were checked.
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V.  Design 


This chapter presents the design of the BOX80 system.
The flow of data through the system, processing techniques and
routines, and system data structures are discussed in the first
three sections. The final sections document the design of
system modules.

Data  Flow 

The functions of the BOX80 system separate into two broad
groups.  To  one  group  are  assigned  functions  dealing  with  the 
evaluation  of  feature  data  and  the  generation  of  class  defini- 
tions. The  other  group  contains  the  microprocessor-based  classi- 
fication function.  This  separation  conforms  to  the  functional 
analysis  of  data  flow  presented  in  Chapter  3.  The  system  is 
thus  implemented  in  two  segments  of  program  code.  Each  consists 
of  independent  program  modules  which  interact  through  standard 
data  files. 

Interpreter  Segment.  This  segment  consists  of  four  inde- 
pendent modules  whose  functions  allow  the  user  to  examine  his 
feature  data  and  to  produce  a standard  set  of  class  definitions. 
These  definitions  are  the  primary  product  of  the  interpreter 
segment. They link this segment to the second segment. The
four modules of this segment are named CREATE, DEFINE, TRYOUT,
and FORMAT. These names reflect their basic functions.

The  flow  of  data  through  the  Interpreter  Segment  is  in 
a circular  path.  Segment  modules  are  executed  by  the  user  in 
an  iterative  cycle.  The  cycle  ends  when  the  user  is  satisfied 
with  the  simulation  of  classifier  performance  which  is  docu- 
mented by  the  TRYOUT  module.  At  this  point,  the  classifier 
error  rate  should  be  acceptably  low.  In  each  iteration  a file 
of  pattern  class  definitions  is  produced.  Execution  of  the 
FORMAT  module  can  transform  this  data  structure  into  one  which 
will  interface  with  the  Classifier  Segment.  This  is  the  final 
step  in  the  interpretation  process. 

Classifier Segment. This segment consists of two independent
modules. One functions as a data input routine. It
allows the user to enter class defining data into microprocessor
memory. The second module is a pilot model of a pattern classifier
which can be used in the user's system. It processes a
buffer of feature vectors against a block of class definitions
and outputs a classification decision for each vector. The
modules in this segment are known as TAPEIN and DECIDE.

The  Classifier  Segment  is  intended  as  a test-bed  with 
which  to  exercise  a classifier  module  which  has  been  configured 
to  satisfy  a user  system.  In  such  a system,  a distributed 
process  would  implement  the  user's  pattern  recognition  function. 
One microprocessor, operating in master mode, would perform the
analog to digital conversions, feature extractions, and transforms
necessary to generate a feature vector for a given pattern.
This microprocessor would interrupt a slave processor to store
each feature vector in a RAM memory buffer accessible to the
slave. The slave processor would continuously operate on the
contents of this buffer, producing as output a log of classification
decisions. The BOX80 system Classifier Segment illustrates
this design concept by demonstrating a classifier program which
can be used in the slave microprocessor. The data formats and
program code for this slave processor's software are a version
of the Classifier Segment's DECIDE module.

The flow of data through the Interpreter and Classifier
segments of the BOX80 system can be visualized as a straight
line path. At execution of system modules along this path various
data files are created. Files, in general, are not updated.
Rather, new files are created based upon the user's analytical
judgment. Any part of this path can be repeated. Thus the BOX80
system data flow supports iterative development of the classification
data structure upon which the user's pattern recognizer
is based. This flow is illustrated in Figure 24. Names of the
modules and routines of the BOX80 system which implement this
flow are listed in Table III. These names are defined in
Table IV.


TABLE III

MODULE AND ROUTINE NAMES

TABLE IV (1/3)

MODULE AND ROUTINE DEFINITIONS

1. CREATE    - Generates FEAT file from user data
   DEFC      - Initializes CREATE module
   SCAN      - Produces "first-pass" statistics on features
   COPY      - Generates FEAT file records
   GETFEA    - Reads user data file
   PRHIST    - Prints statistics and histograms

2. DEFINE    - Generates CLAS file from FEAT records
   DEFD      - Initializes DEFINE module
   ALLOC     - Allocates memory to module buffers
   NEXCLA    - Controls selection of class to be processed
   KERPUT    - Updates class husk list
   CLASSX    - Controls processing of class data
   CDEFI     - Updates prototype definitions and histograms
   FANDER    - Produces feature vectors as husk members
   SHUCK     - Identifies feature vectors as husk members
   SETUM     - Inserts feature boundaries into CLAS file

3. TRYOUT    - Produces error rates and feature subsets
   DEFT      - Initializes TRYOUT module
   MERIT     - Computes figure of merit for each feature
   FIGM      - Presents and accepts feature merit ranking
   SUBSET    - Tags dimensions for elimination
   EVAL      - Performs trial recognition
   DOCU      - Outputs error rate and confusion matrix
   LOCK      - Establishes subspace error rates

4. FORMAT    - Produces microprocessor data and displays
   DEFF      - Initializes FORMAT module
   XCLAS     - Controls processing CLAS file
   XHIST     - Controls processing HIST and DIST file
   XFEAT     - Controls processing FEAT file
   FILBUF    - Loads buffer with PICT and STRIP input
   XMIT      - Sends values to hexadecimal format routine
   NEXREC    - Inputs user selection of data class
   NEXVEC    - Inputs user selection of vector
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TABLE  IV  (2/3) 

MODULE AND ROUTINE DEFINITIONS

TAPEIN       - Decodes and loads cassette tape into SBC 80/20 ROM
   BYTEX     - Reads a pair of hexadecimal characters

DECIDE       - Microprocessor classifier module
   CLOOP     - Outputs a string of characters
   OUTB      - Outputs a buffer of binary values as characters

UTILITIES    - [General Purpose System Routines]
   INITC     - Initializes CLAS file index chain
   ADD       - Adds entry to CLAS file index chain
   DEL       - Deletes entry from CLAS file index chain
   RIX       - Reads CLAS file index chain
   INDEX     - Builds CLAS file table index; scales file
   KERGET    - Accesses CLAS file husk list
   PRCLAS    - Prints CLAS file
   LOADC     - Loads CLAS file buffer
   OPENH     - Opens HIST and DIST files
   OPENX     - Opens FEAT and/or CLAS files
   RFEAT     - Reads FEAT file record
   RHIST     - Reads HIST file record
   WRCLAS    - Writes CLAS file record
   WRHIS     - Writes HIST file record
   STATH     - Updates histogram
   STATX     - Updates statistics
   XSCAL     - Scales FEAT and CLAS vectors
   GETCH     - Reads a character (SBC 80/20)
   CI        - Input from RS232 port (SBC 80/20)
   CNVBN     - Converts to binary (SBC 80/20)
   DIV       - Divides 16 bits by 8 bits (Interp 80)
   HILO      - Compares 16 bit values (SBC 80/20)
   BNBCD     - Binary to BCD conversion (INTEL User Library)
   COUT      - Character output routine (SBC 80/20)


TABLE IV (3/3)

MODULE AND ROUTINE DEFINITIONS

SUPPORT      - (Specialized Support Routines)
   ENER      - Computes 'energy' and string of values
   MARK      - Draws tic mark on TEKTRONIX screen
   PLOT3D    - Hidden line routine draws 3D surfaces
   PLX       - Emulates CALCOMP plot routine
   ILINE     - Generates Intel hexadecimal byte format
   IASORT    - Integer ascending sort
   FDSORT    - Floating point descending sort
   ERROR     - Generates error prompt (SBC 80/20)
   GETCM     - Gets next user command (SBC 80/20)
   ERR       - Generates error prompt


System  Subroutines 

In this section standard supporting techniques for data
manipulation are discussed. Additionally, abstracts of utility
and support routines are presented.


Module  Initialization.  Each  system  module  is  initialized 
by  a subroutine  which  establishes  standard  file  names,  and 
allocates  memory  from  a single  work  area  to  the  file  buffers  and 
tables  required  for  processing.  Record  block  sizes  within  each 
system  file  are  set  by  the  user  at  system  initialization.  These 
two  techniques  simplify  transport  of  the  prototype  generation 
segment  from  one  FORTRAN  capable  system  to  another.  They  allow 
adjustments  for  memory  and  on-line  storage  variations  in  differ- 
ent systems.  Record  block  sizing  algorithms  establish  pointers 
to  starting  locations  of  module  buffers  by  iteratively  adjusting 
data  parameters.  These  are  then  output  for  user  approval  of 
buffer  size  adjustments. 
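
The single work area technique described above can be illustrated with a short example. This is a minimal sketch in Python rather than the FORTRAN of the system itself. The buffer names, the sizes, and the rule of shrinking the block one record at a time are all hypothetical; they stand in for whatever sizing algorithm the modules actually use.

```python
def allocate_buffers(work_area_size, fixed_sizes, record_length, block_records):
    """Partition one work area into named buffers.

    fixed_sizes:   dict of buffer name -> size in words (tables of fixed size)
    record_length: words per record in the record block buffer
    block_records: requested records per block; reduced until everything fits
    Returns (offsets, block_records), where offsets maps name -> start index.
    """
    fixed_total = sum(fixed_sizes.values())
    # Iteratively adjust the block size until the buffers fit the work area.
    while (block_records > 0
           and fixed_total + block_records * record_length > work_area_size):
        block_records -= 1
    if block_records == 0:
        raise ValueError("work area too small for one record per block")

    # Assign each buffer a starting pointer into the single work area.
    offsets = {}
    next_free = 0
    for name, size in fixed_sizes.items():
        offsets[name] = next_free
        next_free += size
    offsets["BLOCK"] = next_free
    return offsets, block_records


offsets, records = allocate_buffers(
    work_area_size=1000,
    fixed_sizes={"INDEX": 153, "HEADER": 47},
    record_length=52,
    block_records=20,
)
print(offsets, records)
```

In this example the requested 20 records per block do not fit, so the block is reduced to 15 records; the adjusted value is what would be shown to the user for approval.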

File  Processing.  In  order  to  design  efficient  structures 
for  feature  vector  and  prototype  data,  usage  and  access  patterns 
were  analyzed.  A file  structure  was  selected  over  the  use  of 
in-core buffers so as to allow greater data volume. Implementation
using separate modules was selected in order to enhance


transportability.  The  requirement  to  generate  prototypes  via 

an interactive, time sharing process raised questions about


modules  is  consistent  with  use  of  minimum  amounts  of  core  and 
time  to  complete  a given  function.  A standard  file  structure 
was  established  for  data  communication  between  modules.  This 
structure  consists  of  four  files  which  are  defined  in  the  next 
section. 

Requirements  to  access  each  file  were  analyzed  in  the 
process  of  defining  structure.  The  feature  vector  data  has 
greatest  potential  volume  and  least  need  for  non-sequential  access. 
Conformance  to  ANSI  FORTRAN  specifications  dictated  a sequen- 
tial access  method  but  allowed  a BACKSPACE  operation.  Thus  a 
disk  or  tape  based  sequential  file  was  selected  for  this  data. 
However,  prototype  data  is  accessed  frequently  in  iterative 
processing,  and  is  not  necessarily  only  used  sequentially. 

Again, conformance to ANSI FORTRAN capabilities dictated use of
a sequential file. However, since its volume is limited, a single
record approach was chosen. Use of an embedded index to the
data vectors associated with each prototype supported efficient
use of in-core storage of this record. This technique also
supports revision to a multi-record random access file structure
in environments, such as with minicomputer hosts, in which there
is extremely limited central memory. Finally, histogram data
appeared to be too voluminous for in-core storage, and too
infrequently used for a multi-file solution to the restriction
posed by the ANSI sequential file standard. Therefore, two files
of the same format were designed. One, with a single data
record, contains universe distributions. The other, containing
one record for each data class, records distributions of data
within each data class. These four files are labeled DIST
(universal distributions), HIST (class distributions), CLAS
(prototypes) and FEAT (feature vectors).

Utility  Routines.  There  are  three  types  of  subroutines 
within the BOX80 system. In the first group are routines
uniquely  specialized  to  support  primary  modules.  These  are 
covered  in  the  next  chapter.  General  purpose  utility  routines 
are  synopsized  below.  Table  D-V  gives  calling  parameters  and 
their  definitions.  Special  purpose  support  routines  having 
general  usefulness  are  discussed  in  the  next  paragraph. 

(1)  ADD.  One  of  four  routines  which  access  the  index  to 
the  prototype  data  file,  this  routine  inserts  a new  entry  to 
that  index.  The  index  is  described  in  the  next  section.  It 
contains  two  chains  of  entries.  The  entries  in  one  chain  corres- 
pond to  column  vectors  in  the  CLAS  file  data  record.  The  entries 
in  the  other  chain  correspond  to  unused  column  vector  positions. 
This ADD routine follows the 'used' entry chain to the appropriate
position and relinks an index entry to that position from
the top of the 'used' chain.

(2)  DEL.  This  routine  deletes  an  entry  from  the  index 
to the prototype data file. Deletion is effected by relinking
around the indicated index entry and adding the newly freed entry
to the unused chain. This routine does not clear the associated
column vector; it is therefore uncoupled from its referenced data
area. This eases data structure modifications.

(3) INITC. This routine sets constant parameters into the
prototype  data  file  index  during  file  initialization.  See 
Figure  25  for  a sketch  of  these  initialization  entries. 

(4)  RIX.  This  routine  reads  the  index  to  the  prototype 
file  and  extracts  the  entry  number  of  the  named  vector.  The 
appropriate  index  entry  is  found  via  a sequential  search  of  the 
'used'  entry  chain. 

(5)  INDEX.  In  order  to  speed  retrieval  of  the  address  of 
named  prototype  data,  this  routine  builds  a table  of  51  three 
position  entries.  Each  position  records  the  address  of  a prototype 
vector.  Entry  51  records  the  address  of  a pair  of  data  limits 
vectors.  At  option,  this  routine  controls  scaling  of  prototype 
data  into  a specified  bit  range. 

(6)  KERGET.  A set  of  vectors  within  the  prototype  file 
records  identifiers  of  feature  vectors  which  have  been  assigned  to 
the  husk  of  each  class.  This  routine  obtains  the  identifier  for 
the  "next"  feature  vector  assigned  to  the  husk  of  a given  class. 

(7)  PRCLAS.  This  utility  routine  prints  the  three  data 
types  stored  within  the  prototype  file.  The  file  index,  the  set 
of  husk  vectors  for  each  class,  and  the  prototype  definition  for 
the  class  are  printed. 


(8) LOADC. This routine loads the prototype data file into
the proper program buffer.

(9) OPENH. This routine reads the header record from
HIST and DIST files, setting the x-dimension memory parameter
associated with the data records of the file.

(10) OPENX. This routine reads header records from
FEAT and CLAS files. The parameters set include the y-dimension
memory  variable  corresponding  to  number  of  data  columns  within  the 
CLAS  file.  These  open  functions  are  coupled  so  that  label  tests 
can  be  made  in  one  place.  At  input  options,  the  FEAT  file  or  the 
CLAS  file  open  can  be  bypassed.  This  is  necessary  when  the  CLAS 
file  is  either  used  alone,  or  is  to  be  initialized  or  extended  in 
size. 

(11)  RHIST.  This  utility  reads  records  from  either  DIST 
or  HIST  files.  A sequential  search  is  made  for  the  requested 
record,  and  no  backspace  or  rewind  option  is  provided.  Records 
containing histogram pairs are flagged. An error indicator is
set if a missing record is requested and an end job flag is set
when an illegal record is requested.

(12)  RFEAT.  This  utility  routine  handles  input  from  the 
FEAT  file.  A rewind  option  and  a backspace  option  support 
reprocessing  the  entire  file  or  the  current  record  block.  A 
sequential  search  is  made  for  the  requested  record.  Missing 
records cause a fatal error flag to be set. Each block is
checked for the last block flag; a pointer to the last vector of
the block is updated when this block is read.

(13)  WRCLAS.  This  subroutine  writes  the  CLAS  file  from 
memory  onto  its  file.  In  this  process  it  assembles  and  outputs 
the  CLAS  file  header. 

(14) WRHIS. This subroutine writes the HIST and DIST
file records. It assembles and outputs the file header record,
and provides a printout of statistics and distribution values
if requested.

(15)  STATH.  This  routine  generates  a histogram  from  the 
stream  of  values  input  on  successive  calls.  The  current  histogram 
is  always  output.  At  receipt  of  a last-call  indicator  a mode  and 
the  percent  of  all  values  associated  with  this  mode  are  calculated. 

(16) STATX. First and second order moments, minimum and
maximum values are computed from a stream of input values.
Temporary values are initialized at first input, which is signalled
to the routine by a zeroed work area parameter. The current
minimum and maximum are always output. A last call indicator
triggers generation of mean and variance values. At option either
the standard deviation from the computed mean, or an average
deviation from a zero mean is computed. The latter deviation is
used for definition of asymmetric class boundaries.

(17) XSCAL. Components of vectors input to this routine
are scaled to a specified range of values. Either of two scaling
algorithms is selectable for use with prototypes or with class
diagonal covariance matrices. (See Chapter IV.)
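
The two-chain index handled by ADD, DEL, and RIX can be pictured with a small model. The following is only a sketch in Python, not the FORTRAN of the system. The class name, the array layout, the ascending ordering of the 'used' chain, and the choice to draw new entries from the unused chain are all assumptions made for illustration; the actual index format is the one defined in the next section.

```python
class ChainIndex:
    """Fixed-size index holding a 'used' chain and an 'unused' chain."""

    END = -1  # marks the end of a chain

    def __init__(self, capacity):
        self.name = [None] * capacity
        # Initially every entry is on the unused chain: 0 -> 1 -> ... -> END.
        self.link = list(range(1, capacity)) + [self.END]
        self.used_head = self.END
        self.free_head = 0 if capacity else self.END

    def add(self, name):
        """Take an entry from the unused chain and link it into the used chain."""
        if self.free_head == self.END:
            raise MemoryError("no unused index entries")
        entry = self.free_head
        self.free_head = self.link[entry]
        self.name[entry] = name
        # Follow the used chain to the insertion point (ascending order assumed).
        prev, cur = self.END, self.used_head
        while cur != self.END and self.name[cur] < name:
            prev, cur = cur, self.link[cur]
        self.link[entry] = cur
        if prev == self.END:
            self.used_head = entry
        else:
            self.link[prev] = entry
        return entry

    def rix(self, name):
        """Sequentially search the used chain; return the entry number or END."""
        cur = self.used_head
        while cur != self.END:
            if self.name[cur] == name:
                return cur
            cur = self.link[cur]
        return self.END

    def delete(self, name):
        """Relink around the entry and return it to the unused chain.

        The entry's stored contents are not cleared, only unlinked.
        """
        prev, cur = self.END, self.used_head
        while cur != self.END and self.name[cur] != name:
            prev, cur = cur, self.link[cur]
        if cur == self.END:
            return self.END
        if prev == self.END:
            self.used_head = self.link[cur]
        else:
            self.link[prev] = self.link[cur]
        self.link[cur] = self.free_head
        self.free_head = cur
        return cur

    def used_names(self):
        """List the names on the used chain in chain order."""
        names, cur = [], self.used_head
        while cur != self.END:
            names.append(self.name[cur])
            cur = self.link[cur]
        return names


index = ChainIndex(4)
for vector_name in (30, 10, 20):
    index.add(vector_name)
index.delete(10)
print(index.used_names(), index.rix(20))
```

Because deletion only relinks entries, the freed entry is immediately available to the next insertion, and the data it referenced is left untouched, which mirrors the decoupling noted under DEL.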

Support Routines. The subroutines described below each
support unique functions within the BOX80 system. However, since
these routines have a conceptually general utility they are discussed
as a group here. Table D-VI gives calling parameters and
their definitions.

(1)  MARK.  This  routine  generates  a tic  mark,  of  specified 
direction and length, on the TEKTRONIX screen.

(2)  ENER.  This  routine  computes  the  magnitude  squared,  or 
sum  of  the  squares  of  the  component  values  of  any  vector. 

(3) PLOT3D. This subroutine executes a hidden line
analysis and perspective transformation in order to produce a
three-dimensional plot of a two-dimensional array containing
z-axis values. The subroutine listing contains added comments
and a source reference.

(4) PLX. This routine simulates the CALCOMP routine
PLOT insofar as necessary to translate the capability of PLOT3D
from CALCOMP to TEKTRONIX output.

(5) ILINE. This routine reformats a string of 16 integer
values into a paper tape data format used on various microprocessor
systems. This line format is presented in Table D-VIII.
A process switch controls operation of this routine. It allows
generation of initial line address characters and output of special
end of record line format. Integer to hexadecimal encoding uses
only ANSI standard FORTRAN operations.

(6) IASORT. This routine sorts an input array of N integers
into ascending order. The input data sequence is lost.

(7) FDSORT. This routine sorts a pair of input arrays into
descending order based on the real value of members of one array.
It preserves the input data value sequence.

(8) DIV. This 8080 routine divides a 16-bit dividend by
an 8-bit divisor producing an 8-bit quotient and an 8-bit remainder
(Ref 15:18).

(9)  BNBCD.  This  8080  routine  converts  a 16-bit,  two  byte 
integer  into  a string  of  5 ASCII  character  codes  (Ref  19:18). 

(10) HILO. This 8080 routine compares two 16-bit unsigned
integers and sets the 8080 carry condition indicator to show a
less than or equal condition (Ref 18:B-26).

(11) COUT. This routine sends a single byte to the RS232
port  of  an  8080  system  if  the  port  is  ready  to  write. 

(12)  CNVBN.  This  8080  routine  converts  the  BCD  represented 
hexadecimal  characters  to  their  integer  values  (Ref  9:8-23). 

(13) GETCH. This 8080 routine reads ASCII valued characters
and strips the parity bit (Ref 19:B-20).
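
The kind of line that ILINE produces can be sketched as follows. The example is in Python and follows the commonly documented Intel hexadecimal record layout (colon, byte count, 16-bit address, record type, data bytes, checksum). That the routine's output matches this layout in every detail is an assumption, since the governing format is the one given in the appendix table; the function names are illustrative. The encoding uses only integer division and remainder, in the spirit of the restriction to standard FORTRAN operations.

```python
HEX_DIGITS = "0123456789ABCDEF"


def byte_to_hex(value):
    """Encode 0..255 as two hex characters using integer divide and remainder."""
    if not 0 <= value <= 255:
        raise ValueError("byte value out of range")
    return HEX_DIGITS[value // 16] + HEX_DIGITS[value % 16]


def checksum(fields):
    """Two's complement of the sum of the record bytes, modulo 256."""
    return (256 - sum(fields) % 256) % 256


def data_record(address, values):
    """Build one data record (type 00) holding the given byte values."""
    fields = [len(values), address // 256, address % 256, 0] + list(values)
    return ":" + "".join(byte_to_hex(b) for b in fields + [checksum(fields)])


def end_record():
    """Build the end-of-file record (type 01, no data bytes)."""
    fields = [0, 0, 0, 1]
    return ":" + "".join(byte_to_hex(b) for b in fields + [checksum(fields)])


line = data_record(0x0100, list(range(16)))
print(line)
print(end_record())
```

A loader can verify each line by summing all of its bytes, checksum included, and confirming that the low eight bits of the total are zero.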

User  Input  Routine.  To  make  the  CREATE  module  as  general 
as  possible  a user  supplied  routine  is  referenced  to  read  the 
user  file  of  feature  data.  The  specifications  for  this  module 
are  described  below. 
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GETFEA.  This  routine  reads  the  file  of  feature  data  sup- 
plied by  the  user.  Its  calling  parameters  are  defined  in  Table 
D-VIII.  On  the  first  entry  to  this  routine,  the  user  data  file 
is rewound. On the first and all successive entries a feature
vector is returned. Additionally a feature vector identifier
(number of the vector within its class), a class identifier, and
an error flag are returned on every call. If the vector returned
is not the last of its class, this flag is set to zero. If this
vector is the last of its class, this flag is set to -1. If this
vector is the last of the file, this flag is set to +1. Once the
last vector of the user file has been output the routine must
reset all internal flags to allow for rewind on the next call.
Input to the routine includes a file name and buffer location
with size as well as an option switch with nine settings for use
as desired.
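
The flag protocol required of GETFEA can be illustrated with a small stand-in. This is a hedged sketch in Python, not a FORTRAN routine with the calling sequence of the appendix table. The class name, the in-memory list that replaces the user file, and the omission of the file name, buffer, and option switch parameters are simplifications assumed for the example.

```python
class FeatureReader:
    """Hypothetical stand-in for a user-written GETFEA routine."""

    def __init__(self, records):
        # records: list of (class_id, vector) pairs, already ordered by class
        self._records = list(records)
        self._position = None      # None means "rewind on the next call"
        self._count_in_class = 0

    def getfea(self):
        """Return (vector, vector_id, class_id, flag) for the next vector."""
        if self._position is None:          # first entry, or first after the end
            self._position = 0
            self._count_in_class = 0
        class_id, vector = self._records[self._position]
        self._count_in_class += 1
        vector_id = self._count_in_class    # number of the vector within its class

        if self._position == len(self._records) - 1:
            flag = 1                        # last vector of the file
            self._position = None           # reset so the next call rewinds
        else:
            next_class_id = self._records[self._position + 1][0]
            if next_class_id != class_id:
                flag = -1                   # last vector of its class
                self._count_in_class = 0
            else:
                flag = 0                    # more vectors follow in this class
            self._position += 1
        return vector, vector_id, class_id, flag


reader = FeatureReader([(1, [0.1, 0.2]), (1, [0.3, 0.4]), (2, [0.5, 0.6])])
results = [reader.getfea() for _ in range(4)]
flags = [result[3] for result in results]
print(flags)
```

The fourth call in the example shows the rewind behavior: after the end of file flag has been returned, the next call starts again at the first vector.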

Data  Structures 

In this section descriptions are presented for each of
the data structures used within the BOX80 system. Separate paragraphs
below discuss the design of each file and define its format.
There are six system files. Associated with one file is
an index structure which is described separately. The system
names for these files are stated with brief definitions below.

(1) FEAT. This file contains feature vectors ordered
by class.

(2) CLAS. This file contains class prototype definitions
and an embedded index (referenced as LIST) to class definitions.

(3)  DIST.  This  file  contains  component  statistics  and 
distributions  on  the  population  of  data  vectors  input  to  the 
CREATE  module. 

(4)  HIST.  This  file  contains  statistics  and  distributions 
on  the  data  within  each  class  processed  by  the  DEFINE  module. 

(5)  PROT.  This  file  contains  prototype  definitions  in 
a format  suitable  for  use  by  the  DECIDE  module. 

(6)  FVEC.  This  file  contains  feature  vectors  in  a format 
suitable  for  input  to  the  DECIDE  module. 

Features  Data  File  (FEAT).  This  file  is  ordered  by  pattern 
class  and  consists  of  multi-block  records  with  one  record  per 
pattern  class.  Each  data  block  has  the  same  format  and  has  a 
fixed  size.  This  size  is  fixed  at  file  creation,  as  earlier 
stated,  to  give  the  user  control  over  use  of  his  available  memory 
resource.  The  first  record  of  the  file  is  a single  header  block. 
The  last  record  of  the  file  is  a single  trailer  block.  Formats 
for  these  three  block  types  are  given  in  Table  A-I.  The  file  is 
produced  by  the  CREATE  module  which  reformats  input  data  from  a 
user  file  and  adds  several  tag  values  to  each  data  vector. 

The  FEAT  file  structure  is  designed  to  allow  variably 
sized  data  sets  to  be  retrieved  efficiently.  Each  vector  within 
a block  is  tagged  with  its  own  identifier  and  the  identifier  of 
its class for ease in documenting trial error rates, as well as
for convenience in manually referencing file content. The header
record  was  required  to  allow  input  of  data  into  a variably  sized 
buffer  when  the  file  is  read.  Its  critical  parameter  gives  the 
x-dimension  of  this  input  data  buffer.  The  trailer  record  stores 
minimum  and  maximum  component  values  for  later  use  in  scaling 
feature  vector  components  to  a byte-sized  value  range. 
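As an illustration of how the trailer record's minimum and maximum values could be used, the sketch below maps each component linearly onto the byte range 0-255. The linear mapping, the rounding, and the function name scale_to_byte are assumptions; the text states only that the limits are stored for later scaling to a byte-sized range.

```python
def scale_to_byte(vector, minima, maxima):
    """Scale each component into 0..255 using per-dimension limits."""
    scaled = []
    for x, lo, hi in zip(vector, minima, maxima):
        if hi == lo:
            # A constant dimension carries no information; map it to 0.
            scaled.append(0)
            continue
        y = round(255 * (x - lo) / (hi - lo))
        # Clamp in case a vector falls outside the recorded limits.
        scaled.append(max(0, min(255, y)))
    return scaled
```

Storing the limits once in the trailer lets every later module apply the same scaling without a further pass over the data.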

Class Definitions File (CLAS). This file consists of two
records.  The  ten-word  header  record  identifies  the  file  and 
establishes  the  size  of  the  prototype  data  record.  This  size  is 
set  by  the  user  during  operation  of  the  DEFINE  module.  The  data 
record  contains  a set  of  vectors  and  a file  index  known  as  LIST. 
This  index  will  be  discussed  in  another  paragraph.  Data  vectors 
are  of  three  types.  Each  class  is  defined  by  a subset  of  these 
vectors  containing  from  two  to  nine  elements.  Vector  types  include 
a class  prototype  or  mean  vector,  a class  boundary  or  deviation 
vector,  and  a class  husk  vector.  Prototype  and  deviation  vectors 
are  directly  used  in  the  classification  algorithm.  The  husk 
vector  is  used  in  the  identification  of  candidate  feature  vectors 
for the formation of mean and deviation values. Deviation vectors
represent an uncorrelated covariance of feature vector
components within the class. A processing option allows these
vectors to represent positive and negative zero-based deviations
of class feature vectors from the mean vector. In either case,
the classification algorithm operates upon these deviation vectors
as diagonalized matrices. Thus the term vector is used at this point
only for parallelism and simplicity since a vector is indeed a
linear list of components. Formats for the file header and for
these vectors are given in Tables A-II and A-III, and in Figs. 25 to 27.
(Note:  The  structure  of  the  data  buffer  for  the  CLAS  file  data 
record  is  intricate.  The  content  of  this  buffer  is  varied.  The 
referenced  figure  and  tables  must  be  examined  as  a set  in  order  to 
understand  this  structure  and  content.) 

Column  vectors  within  the  CLAS  file  data  record  have  JD 
dimensions  and  have  three  tags.  These  tags  extend  column  size 
to JDX. Tag three, for all vector types, represents the class id.
Tag two carries a code indicating vector type. Tag one stores
two  types  of  value:  for  class  mean  and  deviation  vectors  it 
contains  the  least  and  greatest  component  values  in  the  class  for 
component  scaling  in  the  cluster  plot  process;  in  class  husk 
vectors  it  contains  the  number  of  the  next  open  vector  component 
for  use  in  husk  manipulations.  Tags  one  and  two  are  used  as 
identifiers  for  their  respective  vectors  in  character  printouts 
of  this  data  record. 

Class Definition File Index (LIST). This structure is
an array with two items per entry, having two more entries than
there are columns for data vectors within the CLAS file data
record. These entries provide an index to the data record. The
first item of each entry contains a code for the name of a given
vector in the data record. The second item of the entry contains
a pointer to the list entry position of the entry which logically
follows the present entry according to the value of its name.
There is one entry in this index for each column vector in the
CLAS file data record. The two extra entries in this index are
described by Fig. 25. The first of these entries is always the
first entry in the array. Its first item points to the first
unused entry of the array. Its second item points to the first
used entry in the array. The second entry in the array is a
dummy entry whose name indicates logical 'last'. The pointer
item in this entry is arbitrary since the index entry chain is
not circular. These entries are used by the index service
routines to maintain the logical chain of name items. Because
of these two entries, whose physical positions are fixed, the
name-item in each functional entry in the array refers to a
column vector in the data record whose column number is two less
than the list entry position of that name-item.

Fig. 25. CLAS File Index Structure

CHAIN STRUCTURE:

UTOP - Points to position of first 'unused' entry
KTOP - Points to position of first used entry
LAST - NAME of logically last entry
NAME - Integer name of associated vector
LINK - Points to position of next logical entry
NENT - Number of index entry positions

Fig. 27. CLAS File Data Record - Vector Tags
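The index chain described above can be sketched as a pair of parallel arrays. This is an illustration in Python, not the thesis code: the class name ClasIndex, the sentinel value used for the 'last' name, and the way unused entries are chained together are all assumptions. Positions are kept 1-based to match the text, so a functional entry at position p refers to column p - 2.

```python
LAST = 10**9  # hypothetical code for the logical 'last' name


class ClasIndex:
    def __init__(self, ncols):
        self.nent = ncols + 2              # NENT: two more entries than columns
        self.name = [0] * (self.nent + 1)  # position 0 unused (1-based positions)
        self.link = [0] * (self.nent + 1)
        self.name[2] = LAST                # entry 2: dummy 'last' entry
        for p in range(3, self.nent):      # chain the unused entries (assumed)
            self.link[p] = p + 1
        self.name[1] = 3 if ncols else 0   # entry 1, item one: UTOP
        self.link[1] = 2                   # entry 1, item two: KTOP

    def add(self, name):
        """Insert a name in logical order; return its data record column."""
        p = self.name[1]
        if p == 0:
            raise MemoryError("index full")
        self.name[1] = self.link[p]        # take p off the unused chain
        prev = 1                           # link[1] is KTOP, the chain head
        while self.name[self.link[prev]] < name:
            prev = self.link[prev]
        self.name[p] = name
        self.link[p] = self.link[prev]
        self.link[prev] = p
        return p - 2

    def find(self, name):
        """Return the column holding this name, or None if absent."""
        p = self.link[1]
        while self.name[p] < name:
            p = self.link[p]
        return p - 2 if self.name[p] == name else None

    def names(self):
        """List the stored names in logical (ascending) order."""
        out, p = [], self.link[1]
        while self.name[p] != LAST:
            out.append(self.name[p])
            p = self.link[p]
        return out
```

Because the dummy 'last' entry compares greater than any real name, the search loops need no separate end-of-chain test, and columns can be filled in any order while the chain stays sorted by name.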

This  technique  of  indexing  the  column  vectors  within  the 
CLAS  file  data  record  was  chosen  for  two  reasons.  It  allows 
non-sequential  generation  of  each  of  the  vectors  within  the 
prototype  set  for  a given  class  without  requiring  the  reservation 
of  a fixed  amount  of  space  for  the  vector  set  for  each  class. 
Moreover, it supports convenient revision of the memory allocation
to the CLAS file data record by allowing record extension under
program control as well as straightforward modification of the
in-core data structure. This latter fact adds to transportability
of the prototype generation segment of the B0X80 system.

The  name  items  used  in  this  index  array  are  structured 
numbers.  Their  form  is  given  by  the  expression 
NAME = I * 1000 + K

in which I is the class identifier and K is one of the following
numbers:

K = 100 for Mean vectors
K = 201 for Negative deviation vectors
K = 202 for Positive deviation vectors
K = 30n for Husk vectors, n
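The structured name can be built and taken apart with integer arithmetic, as in the following sketch. The function names and the string keys are hypothetical, and the husk digit n is assumed to be a single decimal digit.

```python
K_CODES = {"mean": 100, "neg_dev": 201, "pos_dev": 202}


def make_name(class_id, kind, n=0):
    """Build NAME = I * 1000 + K for a vector of the given kind."""
    k = 300 + n if kind == "husk" else K_CODES[kind]
    return class_id * 1000 + k


def split_name(name):
    """Recover (I, K) from a structured name."""
    return divmod(name, 1000)
```

Since the class identifier occupies the high-order digits, sorting names numerically groups all vectors of one class together, mean vector first.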

Distribution  Data  File  (DIST).  This  file  consists  of  two 
records.  The  first  is  a header  record.  The  second  is  a data 
record  which  contains  statistics  and  histograms  for  each  feature 
component's  values  as  they  occur  within  the  entire  population 
represented  by  the  FEAT  file.  The  DIST  file  is  generated  by  the 
CREATE module. Table A-IV describes the record format and defines
the data items within this file. The FORMAT module processes
this data. It provides graphic displays of histograms for
analytical use.

Histogram Data File (HIST). This file consists of records
generated by the DEFINE module. HIST file data records are
processed by the FORMAT module to produce graphic displays for
analytical use. The header record for this file is identical in
format to that of the DIST file. The data records of this file
are fixed in size, but have a variable format. Size is fixed to
the space allocated by the DEFINE module. The record format
varies in order to allow storage of histograms output by the
prototype revision process when asymmetric classes are defined. In
the symmetric format the histogram data area for each feature
dimension contains one set of eight statistics and one set of
interval counters which store the histogram. In the asymmetric
format, two sets of statistics and two sets of histogram counters
are maintained. The added pair of sets appears within the array
starting at the position indicated by the variable KTR. The
variable NI gives the number of histogram intervals maintained
in the asymmetric case. The variable NBUC gives the number of
intervals for the symmetric case.

Prototype Data File (PROT). This file consists of a set
of class defining data records which are encoded in a superimposed
data format. The latter format facilitates transfer of
the prototype definitions from the prototype generation segment
of the B0X80 system to the classifier segment of the system. It
is described in Table D-VII. This format is a data standard
(Ref 17) on the INTEL SBC 80/20 microprocessor used for execution
of the classifier segment. With minor variations, it is also used
on Motorola 6800 systems (Ref 26). The format for the actual
prototype data values is given in Table A-V. Each class definition
in this prototype format consists of a string of values which
define the region of feature space assigned to the class. These
values include the prototype mean and positive and negative
deviation values for each feature dimension. Each such string of
values is preceded by a class identifier.

Feature Vector File (FVEC). This file consists of feature
vector components and vector identifiers. These values are
ordered for processing by the classifier segment of the B0X80
system. Data in this file is encoded in the hexadecimal format
presented in Table D-VII. The structure and content of a feature
vector record in this file are presented in Table A-VI. This file
is produced by the FORMAT module of the prototype generation
segment for use in test processing. It demonstrates the feature
vector structure processed by the classifier segment.
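Table D-VII is not reproduced in this section. Assuming that the hexadecimal transfer format it describes is the Intel hexadecimal object record format commonly used with 8080 systems (an assumption, not something the text confirms), one record could be produced as in the sketch below. The function name hex_record is hypothetical.

```python
def hex_record(address, data, rectype=0):
    """Encode one record: ':' count, 16-bit address, type, data, checksum."""
    body = [len(data), (address >> 8) & 0xFF, address & 0xFF, rectype]
    body += list(data)
    # The checksum is the two's complement of the low byte of the sum.
    checksum = (-sum(body)) & 0xFF
    return ":" + "".join(f"{b:02X}" for b in body + [checksum])
```

Each record carries its own load address and checksum, so byte-sized prototype or feature values can be sent over a serial line and verified record by record.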


Interpreter  Segment 

This segment consists of four modules. These are CREATE,
DEFINE, TRYOUT, and FORMAT. Separate subsections define the
design of each of these modules.

CREATE. This module builds a file (FEAT) of feature vectors
in B0X80 system format for later system processing. A file
(DIST) containing statistics and histograms for all vector
components processed may be output. Various vector transforms are
possible. Output of a statistics report as well as certain
execution trace information can be provided on the LOGF file.
Each CREATE function is described below with the subroutine which
implements it. Fig 28 presents a data flow chart for CREATE.
A structure diagram is given in Fig 29.

CREATE is composed of four major functional routines.
These are DEFC, SCAN, COPY, and GETFEA. The first three are
part of the B0X80 system; the last is intended to be a
user-supplied routine. The sequence of subroutine calls within this
module and the input/output parameters for specialized CREATE
subroutines are presented in Tables V and B-I. Functional
abstracts of these routines follow.

(1)  DEFC  performs  module  initialization  functions:  file 
names  are  set,  and  user  options  are  requested,  error  checked  and 
set.  Memory  is  allocated  according  to  an  algorithm  which  sizes 
histograms  and  FEAT  file  record  blocks.  Space  is  allocated  as 
requested  to  a user  buffer  for  input  of  user  data  through  routine 
GETFEA. 

(2) SCAN accumulates statistics on feature vector components
obtained from the user data file. A summary printout is
provided at user option.

(3) COPY processes the user data input, and builds the
output FEAT file. Vector transforms are exercised at user option
prior to output of FEAT file records. Statistics and histograms
are generated for these output feature vectors.

Fig. 29. CREATE Structure Diagram

TABLE V

Sequence of Calls in CREATE Process

Routine     Description

CREATE      Generates FEAT file from user data
DEFC        Requests and sets module parameters
ERR         Echoes error prompt to terminal
SCAN        Collects statistics on users data set
*GETFEA     Reads feature vector from user file
STATX       Updates statistics
COPY        Copies user data set into FEAT file form
*GETFEA     Reads feature vector from user file
ENER        Computes the energy in a set of values
STATX       Updates statistics
STATH       Collects multivariate histogram
WRHIS       Writes HIST file record

Underlined routines are unique to CREATE


(4) GETFEA is a file read routine supplied by the user.
Three sample GETFEA routines are listed in Appendix E. Table D-VIII
summarizes specifications for this user input module in terms of
its input/output parameters.

CREATE processing consists of initialization followed by
a one (or two) pass process through the user data file. Vector
transforms which can be selected for the output feature vectors
establish standard value ranges for component dimensions which
make comparisons of interclass histograms convenient by effecting
a linear shift of component values. The energy normalization,
unitization and vector shift transforms affect feature vector
magnitudes but preserve relative angles. The squaring transform
varies both vector magnitudes and angles in order to extract
as much precision from vector components as possible. Control
options are given in Table B-I. Outputs to the LOGF file include
a trace of subroutine exits, and a dump of input data records as
well as printouts of data base statistics and histograms.
Appendix K contains a sample of selected LOGF output. Table C-I
briefly summarizes each possible LOGF output.
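The statement that magnitude-scaling transforms preserve relative angles can be checked with a small sketch. The definitions used here are assumptions: energy is taken as the sum of squared components (as the ENER routine is described only as computing "the energy in a set of values"), and unitization is taken as division by the square root of that energy. The function names are hypothetical.

```python
import math


def energy(values):
    """Sum of squared components (assumed definition of energy)."""
    return sum(v * v for v in values)


def unitize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(energy(vector))
    return [v / norm for v in vector] if norm else list(vector)


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / math.sqrt(energy(a) * energy(b))
```

For example, the vectors (3, 4) and (4, 3) have cosine 0.96 both before and after unitization, while their magnitudes change from 5 to 1.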

DEFINE. This module generates and revises the CLAS file.
It can be used to shuck sets of feature vectors so as to isolate
the kernel of patterns most acceptable for use in prototype
generation. It can be used to enlarge the structure of a CLAS file
so as to allow for generation of sink-prototypes. Prototypes
can be generated singly or as an entire set. This process defines
hyperrectangular regions in feature space which may be either
symmetrical or asymmetrical about the mean of a class of feature
vectors. The primary output of the module is a CLAS file.
Secondarily, a HIST file may be output. This will contain histogram
records for each class of feature vectors processed. The process
of shucking sets of feature vectors is supported by a graphic
display of vector component overlap. In addition to this support,
a control structure and dummy calls are provided at points
appropriate for interactive and automatic selection of husk feature
vectors. Each function of DEFINE is described below with the
subroutine by which it is implemented. Fig 30 presents a data
flow chart for DEFINE. Fig 31 presents a structure diagram.

DEFINE  is  composed  of  three  major  subroutines  and  many 
supporting  utility  routines.  These  major  subroutines  are  NEXCLA, 
CLASSX,  and  CDEFI.  The  supporting  routines  unique  to  DEFINE  are 
DEFD,  ALLOC,  PHUSK,  KERPUT,  KERGET,  FANDER  and  SETLIM.  The 
sequence  of  subroutines  called  in  a simple  execution  of  this 
module is given in Table VI. The parameters for subroutines
unique  to  this  module  are  defined  in  Table  D-II.  An  abstract 
of  each  of  these  routines  is  given  below.  Routines  appear  in 
execution  sequence. 

Fig. 31. DEFINE Structure Diagram

TABLE VI

Sequence of Calls in DEFINE Process

Routine     Description

DEFINE      Generates CLAS file from FEAT records
DEFD        Defines module parameters
OPENX       Opens FEAT and CLAS files
ALLOC       Allocates memory to initial CLAS file
ERR         Echoes error prompt to terminal
LOADC       Loads existing CLAS file
NEXCLA      Sets pointer to new class and obtains controls
RFEAT       Reads FEAT record
ERR         Echoes error prompt to terminal
PHUSK       Prints prototype husk list
KERGET      Gets entry from husk
RIX         Finds husk list entry in CLAS file
KERPUT      Puts entry into husk list
RIX         Finds husk list entry in CLAS file index
ADD         Adds a husk list entry to CLAS file
DEL         Deletes a husk list entry from CLAS file
CLASSX      Generates a prototype for this class of FEAT data
INITC       Initializes index to CLAS file entries
RIX         Finds prototype entries in CLAS file index
ADD         Adds prototype entries into CLAS file index
DEL         Deletes prototype entries in CLAS file index
RFEAT       Reads FEAT record for this class
FANDER      Produces cluster plot of feature vectors
CDEFI       Updates prototype component definitions
KERGET      Gets entry from husk
RIX         Finds husk for this class
STATX       Updates statistics for this class
STATH       Updates histograms for this class
WRHIS       Writes HIST file record
PRCLAS      Prints CLAS file record
SETLIM      Inserts feature bounds into CLAS file
ADD         Adds entry to CLAS file index
RFEAT       Reads FEAT record
WRCLAS      Writes CLAS file to disk

Underlined routines are unique to DEFINE

(1) DEFD initializes DEFINE. Table B-II defines input
control options provided by this routine. Program parameters
initialized by DEFD include the block of file names referenced
by all input/output statements and the set of memory parameters
which establishes the size and structure of the CLAS file.
Comments in the appendix listing of this routine clearly define
these parameters. DEFD controls allocation of memory to the
CLAS file through OPENX (for existing files) and ALLOC
(for new or revised files).

(2)  ALLOC  uses  a set  of  statement  functions  to  allocate 
available  memory  to  the  CLAS  file  and  the  HIST  file.  An  iterative 
computation  of  available  memory  expands  three  file  parameters 
until  the  limit  is  met.  User  requests  to  allocate  space  for  extra 
prototypes (variable NE) are honored first; requests for histogram
intervals (variable NBUC) are honored next; then, requests
for  space  for  prototype  husk  entries  (variable  MAXKV)  are  filled. 
If  changes  are  made,  user  approval  is  requested.  Disapproval 
aborts  the  module. 

(3)  NEXCLA  controls  optional  processing  of  each  class  of 
data.  Table  B-III  defines  its  input  controls.  Embedded  in  this 
routine  is  the  mechanism  which  allows  direct  user  assignment  of 
feature  vectors  to  the  husk  of  a class.  Multiple  passes  through 
each  FEAT  file  record  are  possible  through  an  option  in  this 
routine.  General  control  inputs  include  plotting  parameters,  as 
well  as  processing  function  selections.  The  primary  output  of 
the  routine  (variable  NEXC)  identifies  the  class  data  set 
about  to  be  processed. 


(4) PHUSK is a support routine which uses KERGET to access
and print the husk list associated with a prototype.

(5)  KERGET  is  a support  function  which  uses  RIX  to  extract 
consecutive  husk  vectors  from  the  CLAS  file  entry  for  a given 
prototype.  The  next  stored  husk  entry  is  returned  on  successive 
calls  in  which  the  data  class  specification  remains  the  same. 

(6) KERPUT is a support routine through which a short list
of  husk  vector  numbers  can  be  inserted  into  the  linked  list  of 
husk  vector  numbers  which  is  maintained  in  the  CLAS  file. 

(7) CLASSX is the primary control routine within DEFINE.
If the user has selected an initialization process, CLASSX
initializes the CLAS file using INITC and ADD. If a follow-on
process has been requested, CLASSX establishes prototype locations
within the CLAS file, and allows revision of those addresses. The
major cycle of CLASSX provides FEAT records to CDEFI, FANDER,
SHUCK or PICKER as requested by control parameters. SHUCK and
PICKER are dummy exits for either automatic or interactive graphic
assignment of feature vectors to the husk of a class. In addition
to this control process, CLASSX updates the current HIST
record whenever CDEFI has processed. Both an exit trace, and a
trace of internal computations are embedded in this code.

(8) FANDER produces a plot of feature vector components.
The ordinate of this plot can be scaled according to three options
requested by NEXCLA. See Table B-III. The abscissa of this plot
consists of a set of discrete locations, one for each dimension
of the feature space. Plotting produces a set of points for each
feature vector. These can be connected to suggest the character
of the individual feature vector. All vectors within each data
block of a given FEAT file record can be accessed and plotted by
FANDER. The effect is that of a heavily overlaid set of line
graphs which suggest in a single display the degree of correlation
and the variance in all feature dimensions. The combination of
this picture and either a listing of vector components or the
PICKER routine supports shucking unreasonable feature vectors
from the FEAT file set used for prototype generation. Figs 32
and 33 provide samples of FANDER output.

(9)  CDEFI  generates  prototypes.  At  each  call,  CDEFI 
processes  one  block  of  the  current  FEAT  file  record.  At  each 
exit  a prototype  exists  within  the  CLAS  file  which  reflects  all 
feature  vectors  processed  to  that  exit.  KERGET  is  used  to 
reject  from  this  process  any  feature  vectors  assigned  to  the 
husk of the class. A HIST file record is updated at each call
to CDEFI and is available for use at each exit. Both an exit
trace  and  a log  of  intermediate  calculations  are  supplied. 

Fig. 33. Features Cluster Plot - Sample (B)

(10) SETLIM accesses the FEAT file trailer record to
obtain maximum and minimum component values established for each
dimension of the feature space by CREATE. It then uses ADD to
establish a CLAS file entry for this data, and updates the CLAS
file. These global component limits are used within TRYOUT and
FORMAT in order to scale feature components into the byte-sized
range (0-255) required for microprocessing.
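The priority order that ALLOC applies to user requests (NE first, then NBUC, then MAXKV) can be illustrated as follows. This is only a sketch: the fixed per-unit memory costs, the function name allocate, and the dictionary interface are assumptions, whereas the thesis routine computes sizes through FORTRAN statement functions that are not reproduced here.

```python
def allocate(limit, base_words, requests, cost_per_unit):
    """Grant requests in priority order until the memory limit is met.

    requests and cost_per_unit are dicts keyed by NE, NBUC and MAXKV.
    Returns (granted, words_used, changed); changed is True when some
    request had to be reduced, which is when user approval is needed.
    """
    order = ["NE", "NBUC", "MAXKV"]  # priority stated in the text
    granted = {key: 0 for key in order}
    used = base_words
    for key in order:
        # Expand this parameter one unit at a time while memory remains.
        while (granted[key] < requests[key]
               and used + cost_per_unit[key] <= limit):
            granted[key] += 1
            used += cost_per_unit[key]
    changed = any(granted[key] != requests[key] for key in order)
    return granted, used, changed
```

With a limit of 95 words, 40 words of fixed overhead, and requests of NE = 2, NBUC = 10 and MAXKV = 5 at 10, 3 and 2 words per unit, the first two requests are met in full and MAXKV is cut back to 2.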

DEFINE processing has three major paths. The initialization
path is followed when selected as an option at module start.
Prototypes can be initialized only as a complete set one to one
with the FEAT file. When a CLAS file is initialized the last
record of the FEAT file is entered into the CLAS file. This
record contains scale factors for the feature space which are
used in other DEFINE paths and in both TRYOUT and FORMAT. Thus
an initial CLAS file is a prerequisite for all other DEFINE
processing paths. The regeneration path supports selection of husk
vectors and allows definition of a new prototype without processing
those vectors. Prototype husks are stored in the CLAS file;
this regeneration process can be a heuristic iteration. When this
path is followed, specific prototypes may be selected for revision.
A given class of feature vectors (i.e., a record from the FEAT
file) may be completely processed in repetitive iterations. The
third path allows generation of asymmetric prototypes; its
processing parallels that of the regeneration path. A CLAS file is
output whenever DEFINE is run. A variety of selectable outputs
may be written to the LOGF file. Appendix K contains a sample
LOGF file produced by DEFINE. Table C-III briefly describes the
contributions of each routine to this output. A list of terminal
outputs is presented in Table C-II. Output messages are described
in each table in their approximate order of appearance during
program execution.
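The regeneration path, in which a prototype is rebuilt block by block while husk vectors are skipped, can be sketched as a running accumulation. The class name Prototype and its interface are hypothetical, and the use of the population standard deviation for the deviation vector is an assumption; the text describes deviation vectors only as an uncorrelated covariance of the components. Only the symmetric case is shown.

```python
import math


class Prototype:
    """Running mean and deviation for one class (symmetric case only)."""

    def __init__(self, ndim):
        self.n = 0
        self.sums = [0.0] * ndim
        self.squares = [0.0] * ndim

    def update(self, block, husk=()):
        """Fold one block of (vector_id, vector) pairs into the prototype."""
        for vector_id, vector in block:
            if vector_id in husk:
                continue  # husk vectors are not processed
            self.n += 1
            for j, x in enumerate(vector):
                self.sums[j] += x
                self.squares[j] += x * x

    def mean(self):
        return [s / self.n for s in self.sums]

    def deviation(self):
        # Guard against tiny negative values caused by rounding.
        return [math.sqrt(max(q / self.n - (s / self.n) ** 2, 0.0))
                for s, q in zip(self.sums, self.squares)]
```

Because only sums are kept, a valid prototype is available after every block, which mirrors the statement that a prototype exists at each exit from CDEFI.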
TRYOUT. This module performs a trial classification of
the feature vectors within a given FEAT file. It can be used to
evaluate the acceptability of a given CLAS file. Additionally, it
can be used to estimate the relative merit of individual feature
dimensions and to construct from the original feature space a
subspace within which classification is optimal. The primary output
of this module is a summary statement of classification error rate.
Input options extend this statement to a confusion matrix format,
and to a set of error rates for each of a set of nested subspaces
of the original space. A secondary output is a revised version of
the input CLAS file. This revision reflects both scaling and
zapping of prototype components. These and the other functions
of TRYOUT are described below with the subroutines which implement
them. Fig 34 presents a data flow chart for TRYOUT. Fig 35
presents a structure diagram.

TRYOUT  is  composed  of  seven  major  functional  subroutines. 
These  are  DEFT,  MERIT,  SUBSET,  FIGM,  EVAL,  LOOK  and  DOCU.  The 
sequence  of  subroutines  called  as  TRYOUT  is  executed  is  listed  in 
Table  VII.  The  input  and  output  parameters  for  unique  TRYOUT 
subroutines  are  defined  in  Table  D-III.  Each  is  synopsized  below. 

Fig. 35. TRYOUT Structure Diagram

TABLE VII

Sequence of Calls in TRYOUT Process

Routine     Description

TRYOUT      Classifies FEAT records against CLAS file
DEFT        Defines module control parameters
OPENX       Opens FEAT and CLAS files
ERR         Echoes error prompt to terminal
LOADC       Loads CLAS file
INDEX       Builds special index to CLAS file
RIX         Reads CLAS file index
XSCAL       Scales prototype into specified value range
PRCLAS      Prints CLAS file
MERIT       Computes a figure of merit for each dimension
SUBSET      Controls prototype component zapping
LOADC       Loads CLAS file
INDEX       Rebuilds special index to CLAS file
RIX         Reads CLAS file index
XSCAL       Scales prototypes as specified
PRCLAS      Prints CLAS file
ERR         Echoes error prompt to terminal
IASORT      Sorts zap tags to ascending order
FIGM        Requests subspace id or subspace tags
FDSORT      Sorts subspace tags into descending order
EVAL        Classifies a given feature vector against CLAS
RFEAT       Reads FEAT file
XSCAL       Scales FEAT vectors into given range
LOOK        Computes feature subspace error rates
DOCU        Documents classification error rates
PRCLAS      Prints CLAS file
RFEAT       Reads FEAT file
WRCLAS      Writes CLAS file to disk

Underlined routines are unique to TRYOUT

(1) DEFT initializes TRYOUT. User inputs are obtained
to define selectable options. Table B-VI defines controls input
by DEFT. Both CLAS and FEAT files are opened, and memory
allocation parameters are set and checked against the system
limit. DEFT sets 'NAMES', the common block of file names used
by TRYOUT.

(2) MERIT computes five sets of figures of merit for the
feature components represented in the CLAS file prototypes.
Three of these are intermediate computations used nowhere else.
Feature evaluation algorithms (see Chapter 4) are used to compute
the output sets of merit figures.

(3) SUBSET is a control subroutine which requests user
inputs to direct the process of feature zapping. This process
expands specified feature deviation values within a given
prototype. The effect is elimination of the specified feature
component from the classification process. Table B-IV defines user
inputs to SUBSET. Figs 18 to 23 present a sample execution of
TRYOUT showing some of these inputs.

(4) FIGM is a control subroutine through which the user
selects which set of merit figures is to be used in production
of an extended set of classification error rates. Table B-V
specifies  control  inputs  to  FIGM.  A list  of  feature  dimensions, 
considered  to  define  a set  of  nested  subspaces,  is  passed  from 
FIGM  to  LOOK  which  computes  the  subspace  error  rate  statements. 
Refer  to  Appendix  K for  a sample  operation  of  FIGM. 

(5) EVAL is the core subroutine of TRYOUT. It controls
reading of the FEAT file and classifies each feature vector
within this file against prototypes within the CLAS file. The
B0X80 decision rule is implemented so as to admit prototypes
whose components have been scaled into the range 0-256. This
allows simulation of the processing within DECIDE, the 80/20
classifier. Additionally, an option allows the user to elect
use of the Euclidean norm rather than the Sup norm within the
decision process. EVAL outputs a summary of error rates and a
confusion matrix. It also controls execution of LOOK.
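
The evaluation performed by EVAL can be sketched as a minimum-distance
classification with a selectable norm, followed by accumulation of a
confusion matrix. This is an illustration only: the deviation
weighting, the function names and the sample data are assumptions,
and the B0X80 decision rule itself is defined elsewhere in this
thesis.

```python
import math


def sup_norm(diffs):
    """Largest absolute component."""
    return max(abs(d) for d in diffs)


def euclidean_norm(diffs):
    """Square root of the sum of squared components."""
    return math.sqrt(sum(d * d for d in diffs))


def classify(vector, prototypes, norm=sup_norm):
    """Return the index of the nearest prototype.

    Each prototype is a (means, deviations) pair. Differences are divided
    by the deviations before the norm is applied (an assumed weighting).
    """
    best_class, best_dist = None, None
    for k, (means, devs) in enumerate(prototypes):
        diffs = [(x - m) / d for x, m, d in zip(vector, means, devs)]
        dist = norm(diffs)
        if best_dist is None or dist < best_dist:
            best_class, best_dist = k, dist
    return best_class


def evaluate(samples, prototypes, norm=sup_norm):
    """Return (confusion matrix, error rate) for labelled samples."""
    n = len(prototypes)
    confusion = [[0] * n for _ in range(n)]  # rows: true class, cols: decision
    for true_class, vector in samples:
        confusion[true_class][classify(vector, prototypes, norm)] += 1
    errors = sum(confusion[i][j] for i in range(n) for j in range(n) if i != j)
    return confusion, errors / len(samples)


prototypes = [([0.0, 0.0], [1.0, 1.0]), ([10.0, 10.0], [1.0, 1.0])]
samples = [(0, [1.0, 1.0]), (0, [7.0, 2.0]), (1, [9.0, 11.0]), (1, [4.0, 3.0])]

confusion, error_rate = evaluate(samples, prototypes)
print(confusion, error_rate)
```

Passing norm=euclidean_norm to evaluate selects the alternative norm
without changing the rest of the procedure.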

(6)  DOCU  is  an  output  format  routine.  The  summary 
performance  error  rate,  the  confusion  matrix,  and  the  list  of 
subspace  error  rates  are  printed  by  DOCU. 

(7) LOOK analyzes each distance vector computed within
the classification algorithm. The components of this vector are
re-ordered according to the list of subspace tags provided by
FIGM. Then each nested subvector is classified within its
subspace, and error rates are recorded for later output by DOCU.
LOOK has a tight interface to EVAL and to DOCU: there are no
subroutine parameters, and there is significant interlacing of
common blocks. This is because LOOK is called inside the
innermost loop of TRYOUT.

TRYOUT processing consists of an initialization sequence
and an evaluation cycle. In the former, DEFT, OPENX, INDEX and
MERIT establish processing options and parameters, load and scale
the CLAS file if opted, and compute merit figures. Control inputs
to this sequence are shown in Table B-VI. In the latter, FIGM,
SUBSET, EVAL, LOOK, and DOCU allow the user to modify the feature
dimensions used in the classification process, and then perform
and document that process. A revised CLAS file is output at
end of job whenever the CLAS file has been zapped via SUBSET. A
variety of selectable outputs may be written to the LOGF file.
These are triggered by the standard control option (L,T,Y) and
by the TRYOUT control option (C,A). Appendix K contains a sample
LOGF file produced by TRYOUT. The contributions of each routine
to this LOGF output are summarized in Table C-IV in the
approximate order of their generation.

FORMAT. This module produces several formats of data
within each of the B0X80 system files. Its primary purpose is
the production of the PROT and FVEC files in hexadecimal paper
tape line format for input to 8080 microprocessor systems.
Secondarily, displays of CLAS, FEAT and HIST records are produced
on the TEKTRONIX 4014 terminal screen. The module is designed to
produce two display formats. The strip chart format, which is
only stubbed into the code, is intended to allow precise
examination of ordinate and abscissa data values for individual
prototypes, feature vectors and feature histograms. The picture
format presents a top level three-dimensional view of global
data variation within sets of prototypes, feature vectors and
feature histograms. These and the other functions of FORMAT are
described below with the subroutine which implements them.
Fig 36 presents a data flow chart for FORMAT. Fig 37 contains a
structure diagram, while Tables B-VII and B-VIII show input
options and controls. Table C-V summarizes outputs in the
approximate order of their appearance on the LOGF file. Table VIII
shows the sequence of subroutine calls executed in operation of
FORMAT.

TABLE VIII

Sequence of Calls in FORMAT Process

Routine   Description

FORMAT    Produces output format from B0X80 file
DEFT      Defines module control parameters
OPENX     Opens CLAS file, and FEAT file if opted
OPENH     Opens HIST file
ERR       Echoes error prompt to terminal
LOADC     Loads CLAS file
INDEX     Builds table to index CLAS file; may scale CLAS
RIX       Finds CLAS file index entries
XSCAL     Scales vector components to stated range
PRCLAS    Prints CLAS file
NEXREC    Requests user input of next data class
ERR       Echoes error prompt to terminal
XFEAT     Controls processing each FEAT file record
RFEAT     Reads FEAT file record blocks
NEXVEC    Requests user choose specific FEAT vectors
ERR       Echoes error prompt to terminal
PICT      Sets up for 3-D plot
PLOT3D    Hidden line routine draws feature vectors
PLX       Emulates CALCOMP PLOT routine
STRIP     Stub for feature vector strip chart function
XSCAL     Scales vector components into stated range
XMIT      Drives hexadecimal line format
ILINE     Produces hexadecimal line output
XCLAS     Controls processing each CLAS prototype
XMIT      Drives hexadecimal line format
ILINE     Produces hexadecimal line output
PICT      Sets up for 3-D plot of prototype data
PLOT3D    Hidden line routine draws prototype boundaries
PLX       Emulates CALCOMP PLOT routine
STRIP     Stub for feature vector strip chart function
XHIST     Controls processing HIST file records
RHIST     Reads HIST file records
FILBUF    Builds buffer for 3-D plot
PICT      Sets up for 3-D plot of histograms
PLOT3D    Hidden line routine draws histograms
PLX       Emulates CALCOMP PLOT routine
STRIP     Stub for histogram strip charting

Underlined routines are unique to FORMAT

FORMAT is composed of five major functional routines and
several unique supporting routines. These are described in the
paragraphs that follow. The input/output parameters for these unique
routines are defined in Table D-IV.

(1)  DEFF  initializes  control  parameters  for  FORMAT  and 
allocates  available  memory  to  buffers  and  tables.  Input  options 
allow  selection  of  a file  data  source  and  choice  of  a processing 
option.  FEAT,  CLAS,  and  HIST  files  may  be  input.  Process  options 
are  transmit,  picture  and  stripchart.  The  selected  source  data 
file(s)  are  initialized  via  subroutine  call  from  this  module. 

(2)  XCLAS  directs  processing  of  CLAS  file  data.  This 
routine  has  three  data  paths.  When  the  transmit  option  has  been 
selected,  XCLAS  formats  a PROT  file  with  the  non-zapped  components 
of  selected  prototypes  and  prepares  a count  of  the  dimensionality 
of  the  prototype  space  represented  by  the  PROT  file.  If  either 
stripchart  or  picture  options  have  been  chosen,  a buffer  is 
filled  and  output  to  the  appropriate  routine  when  full. 

(3) XFEAT controls the processing of FEAT file data. A
special routine (NEXVEC) allows selection of specific feature
vectors. These are either output for transmission as
hexadecimal data lines, or loaded into the buffer used for data
display. When transmission has been opted, only values of
non-zapped features are processed.

(4) XHIST handles the flow of HIST file data to the
display buffer. A special subroutine (FILBUF) does the actual
movement of data items. A utility routine (RHIST) provides access
to the HIST file. When a DIST file (a single record HIST file
produced by CREATE recording universe distributions) has been
input, XHIST sets special processing parameters.

(5)  XMIT  is  the  controlling  driver  for  the  hexadecimal 
format  routine. 

(6) PICT is the controlling driver for the 3-D plot
routine. It requests and sets plot scaling parameters, controls
repetitive displays, and initializes TEKTRONIX graphics.

(7) STRIP is the stub for a routine which should initialize
TEKTRONIX graphics, and label and output a set of stripchart
plots with scaled axes.

(8) FILBUF passes HIST record data to STRIP and PICT. It
allows a LOGF file printout of feature distributions and
statistics as well.

(9)  NEXREC  is  a control  subroutine  through  which  the  user 
selects  classes  of  data  for  processing. 

(10)  NEXVEC  is  a control  subroutine  through  which  the 
user  identifies  (sets  of)  feature  vectors  for  processing. 
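
The hexadecimal line output driven by XMIT can be illustrated with a
short sketch. It assumes that the paper tape line format is the
standard Intel hexadecimal record (a colon, byte count, 16-bit load
address, record type, data bytes, and a two's complement checksum);
the exact layout produced by ILINE is not stated in this section.
The function names and the load address 3C00 are hypothetical.

```python
def hex_line(address, data, record_type=0x00):
    """Format one Intel HEX record: ':' count address type data checksum."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if len(data) > 255:
        raise ValueError("at most 255 data bytes per record")
    body = bytes([len(data), address >> 8, address & 0xFF, record_type])
    body += bytes(data)
    checksum = (-sum(body)) & 0xFF  # two's complement of the byte sum
    return ":" + (body + bytes([checksum])).hex().upper()


def hex_lines(start_address, data, bytes_per_line=16):
    """Split a byte sequence into data records and append an end record."""
    lines = []
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        lines.append(hex_line(start_address + offset, chunk))
    lines.append(hex_line(0, b"", record_type=0x01))  # end-of-file record
    return lines


# Three byte-scaled prototype components loaded at a hypothetical address.
for line in hex_lines(0x3C00, [1, 2, 3]):
    print(line)
```

Each record carries its own load address, which is consistent with
the single output address specified when the transmission option is
elected.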
FORMAT processing begins with the system standard
initialization, during which the selected data file is opened. The
major cycle through the data classes can run automatically or
proceed at the user's request for each desired class. Classes of
data must be processed in ascending order by class number. When
the transmission option is elected, only one address may be
specified for the output PROT or FVEC file. However, when picture
or stripchart options are selected, multiple display outputs are
possible so that differently scaled presentations can be viewed.
Similarly, when FEAT files are processed, a given data class may
be processed repetitively so that different sets of feature
vectors may be output. This should aid in the selection of a
kernel of feature vectors from which to define a class archetype.
Output to the LOGF file is minimal, consisting mainly of journal
entries of user inputs. However, a formatted print-out of feature
statistics is provided. Appendix K contains a sample LOGF file
produced by FORMAT.

Classifier  Segment 

This segment consists of two modules. These are TAPEIN
and DECIDE. The former is a support module. The latter
implements the B0X80 system classifier. They are described in the
following subsections.

TAPEIN. This module loads PROT and FVEC files into
microprocessor RAM in order to set up data buffers for execution
of the DECIDE module. The module is dependent upon service
routines within the SBC 80/20 ISIS 1.0 monitor. Its design is
based upon the ISIS routine which implements the SBC 80/20 "R"
command (Ref 19). Output from the module is simply a block of
RAM locations which are loaded with the data contained on an
input cassette tape. Timing and control variations between
paper tape and cassette tape readers necessitated the module.

TAPEIN is used as a utility of the SBC 80/20. It is
executed according to procedures detailed in Table B-IX. Data are
input to the TAPEIN module on a cassette tape produced by copying
a PROT file generated by the Interpreter Segment. The
procedures for generating this file are shown in Table B-X.

DECIDE. This module classifies feature vectors. It
executes within SBC 80/20 RAM and references RAM locations to
obtain both class definitions and pattern feature vectors.
Outputs from this module are a decision by decision record of
class assignment, and a summary count of correct and incorrect
decisions. The module is designed to be a model, and not to be
a packaged subroutine. Thus, its initialization requires the
user to manually set 80/20 RAM with control values. For
interpreter testing these initializations are duplicated by references
to assembler symbols, which should be set before assembly. These
initial values specify the dimensionality of the feature space
(JD), the number of pattern classes (IC), and the number of
feature vectors in the data block to be processed (LB).
DECIDE is intended to be used as a supporting process
within one of a pair of microprocessors which communicate via a
common bus and a central RAM. The primary processor acquires
data, and generates feature vectors. As each vector is produced,
the secondary processor is interrupted, and the vector is placed
in RAM. The secondary processor is triggered when the first
vector is entered into the RAM data block. It continues to
execute DECIDE, producing classification decisions, until this
data block is empty.

A priori  knowledge  of  test  feature  vector  classification 
is  reflected  in  DECIDE  output.  The  score  keeping  element  of  the 
DECIDE  process  should  be  deleted  in  any  actual  implementation. 
This  code  is  located  in  code  paragraph  0A5,  and  is  shown  in  the 
flow  chart  in  Fig  38.  Figs  39  and  40  present  the  data  flow  and 
structure  within  DECIDE. 
Fig. 40. DECIDE Structure Diagram


VI. Conclusions and Recommendations

This thesis has presented a development system for
microprocessor based pattern recognizers. Two system segments were
implemented. These satisfy the functional requirements established
for the system. The algorithms developed for the system were
defined and were illustrated in the preceding chapters. The
design of the computer program modules which comprise the system
was described in Chapter V. A performance evaluation was provided
for the system through a series of benchmark experiments. Specific
conclusions and a set of recommendations are now provided in the
following sections.

Conclusions

The B0X80 system provides a framework for experimentation.
It can be used to configure a pattern classifier which forms one
node of a two-part microprocessor based pattern recognizer. The
Classifier Segment of the B0X80 system has been tested by
simulation. This testing has shown that the classifier algorithm can
indeed produce recognition decisions with an acceptably low error
rate. The Interpreter Segment of the B0X80 system has been
demonstrated by experiment. Class defining structures have been
generated and trial performance has been measured. Comparison
of this performance with independent experiments using the same
data has shown that the Interpreter Segment can support accurate
pattern recognition. The algorithms used in this latter segment
include a non-parametric, weighted, minimum-distance
classification procedure, and a manually controlled feature selection
technique.

The classifier algorithm was shown to be capable of performance
approximately equivalent to that obtained from OLPARS' and SPSS'
algorithms. This performance in fact exceeds that of previous
AFIT experiments with benchmark data sets (Refs 24, 33) and
verifies simulated alphabet classification error rates projected by
Tallnan (Ref 35). Although suboptimal, this classifier algorithm
is very efficient. Existing AFIT programs, and even the SPSS
system, require far more memory for class defining data structures
than the B0X80 classifier requires. One execution time
comparison showed a 2:1 run time improvement. The concept of
microprocessor development relies upon the use of byte-scaled
features. Experiments with both the FOBW and the alphabet data
showed a less than one percent average increase in errors when the
classifier algorithm operated on these byte-scaled integer values.
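
Byte scaling of the kind referred to above can be sketched as a
linear map of each feature onto the integers that one 8080 byte can
hold. The sketch assumes a target range of 0 to 255, rounding to the
nearest integer, and clamping of out-of-range values; the function
name is hypothetical, and the scaling actually performed by XSCAL may
differ in these details.

```python
def byte_scale(values, low, high):
    """Map floats in [low, high] onto integers 0..255, clamping outliers."""
    span = high - low
    scaled = []
    for v in values:
        q = round((v - low) * 255 / span)   # nearest integer level
        scaled.append(min(255, max(0, q)))  # keep the result in one byte
    return scaled


features = [0.0, 0.25, 1.0, 1.2, -3.0]
print(byte_scale(features, low=0.0, high=1.0))
```

Under these assumptions an in-range value is reproduced to within
half of one quantization step, which is consistent with a small
change in error rate when the classifier operates on the scaled
integers.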

The feature selection algorithm was shown to be comparable
to the OLPARS procedure. Although possibly more difficult to
use, the B0X80 procedure is more flexible than that of OLPARS.
The minimum error rate produced by the B0X80 system is equivalent
to that produced with OLPARS' NMV classifier. This comparison is,
of course, highly data dependent. The B0X80 system feature
selection algorithm chose a best series of nested feature subsets for
classification of the alphabet data. The series of associated
error rates decreased monotonically and asymptotically. The
error rate for each of these nested feature subsets was lower than
the error rate for every other tested feature subset of the same
size. The final subspaces selected for the alphabet and the FOBW
data sets each produced error rates less than or equal to
the lowest error rates obtained by previous AFIT experimenters.
Note that these previous experimenters used two and seven times
as many features for their lowest error rates as were used in the
comparable B0X80 tests.

The Interpreter Segment of the B0X80 system embodies
processing capabilities which have not yet been fully explored.
The CREATE module has options for input data transforms which
were not experimentally evaluated. The DEFINE module has the
necessary data structure to support editing the training data set
so as to define class structures based on analytically selected
class kernels. Subroutine stubs are indicated but not provided for
an automatic editing capability. The TRYOUT module allows
selection of partially disjoint feature subsets for each data class.
A data processing structure for generation of rejection rates
exists. The FORMAT module has indicated but not provided
subroutine stubs for strip chart graphics presentations of histogram
and feature vector data. None of these capabilities was required
of the Interpreter Segment. The conclusion here is that a
significant capacity for enhanced capability is deliberately designed
into the system.

Finally, the B0X80 system is transportable as required. This
fact is not explicitly shown. However, ANSI code conventions
were followed. Design is modular and data structures are sized by
the user. The use of independent modules related by standard
files supports the transportability of this code. This
transportability and the economy of its algorithms make the B0X80 system a
potentially valuable tool for the development of microprocessor
based pattern recognizers.

Recommendations 

A host of general suggestions is possible. One outweighs
all others. The system should be used in an experimental
development of a waveform pattern recognizer. The system design for
this experiment should address the all-important problem of
generating a design data sample which adequately represents the pattern
environment. Local research facilities have supported experiments
of this type which have processed electrocardiographic data.
Because of this ready availability, these data should be used for a
first experiment with the B0X80 system. A list of more specific
recommendations follows.
(1) Error rates achievable with the asymmetric
classification option should be experimentally compared to those
achievable with the symmetric process.

(2) The TRYOUT module should be modified to experiment
with the use of reject boundaries. A constant boundary level
should be used for all classes at first. Then unique boundaries
should be used for individual classes.

(3) The capabilities of the DEFINE module for edit
selection of husk feature vectors should be explored.

(4) A new module, MODIFY, should be produced to investigate
formation of synthetic classes. These should be formed between
classes whose members are easily mistaken as indicated by confusion
matrix output. This module should present interclass distance
measures in graphics and tabular form. These measures should be
designed to qualify the effect of selecting kernel patterns on
the variances and dispersions of individual features.

(5)  The  DEFINE  module  should  be  modified  to  investigate 
mode  based  class  defining  structures. 

All  of  the  above  experimental  modifications  should  use 
the  alphabet  data  set  produced  by  Sponaugle  as  a standard  test 
data  set.  The  value  of  the  Fourier  transform  features  recorded  on 
that  data  set  should  be  further  qualified  by  a classification 
experiment  using  the  31  space  vectors  generated  by  Sponaugle. 
A tool for developing microprocessor based pattern recognizers is presented.
A two segment system of programs is implemented. One segment is a subsystem
consisting of a generalized pattern classifier program and utility routines
for an INTEL SBC 80/20 microprocessor system. The other segment is a
subsystem of four interactive programs. These four programs support feature
selection, pattern class definition and performance evaluation using
procedures fitted to the classifier algorithm. This subsystem operates on a
user supplied file of feature vectors. It produces a class defining
structure for use by the classifier. It can use a TEKTRONIX 4014 for
graphics support and will operate interactively within the CDC 6600
Intercom partition. Structured design, modular code, buffer allocation
algorithms, and ANSI standard FORTRAN code make this segment transportable. The
classifier segment requires an 8080 system. Less than 256 bytes of ROM are
used. Data buffer locations and sizes, the number of classes and the
number of features are specified by the user. Experiments produced
estimates of classifier performance for this system. An error rate of
less than ten percent is reported for one 26 class character recognition
experiment.